magento - Update product default image to first image in image gallery - magento

I have an import script that imports well over 2000+ products including their images. I run this script via CLI because I feel that this is the best way to go speed-wise even though I have the same import script available and executable at the magento admin as an extension. The script runs pretty well. Almost perfect! However, sometimes the addToImageGallery somehow malfunctions and results into some images having No Image as the default product image and the only other image as not selected as defaults at all. How do I mass-update all products to set the first image in the media gallery for the product to the default 'base', 'image' and 'thumbnail' image(s)?

I found a couple of tricks on doing this (and more) on this link:
http://www.magentocommerce.com/boards/viewthread/59440/ (Thanks transio!)
Although, for Magento 1.6.2.0 (which I use), the first SQL trick there (Trick 1 - Auto-set default base, thumb, small image to first image.) needs a bit of modification.
On the second-to-the last-line there is a AND ev.attribute_id IN (70, 71, 72) part. This should point to attribute ID's which will probably not be relevant in Magento 1.6.2.0 anymore. To fix this, using any MySQL query tool (PHPMyAdmin or MySQL Query Browser), I took a look at the catalog_product_entity_varchar table. There should be entries like:
value_id, entity_type_id, attribute_id, store_id, entity_id, value
..
146649, 4, 116, 0, 1, '2'
146650, 4, 76, 0, 1, ''
146651, 4, 78, 0, 1, ''
146652, 4, 79, 0, 1, '/B/0/B05-01.jpg'
146653, 4, 80, 0, 1, '/B/0/B05-01.jpg'
146654, 4, 81, 0, 1, '/B/0/B05-01.jpg'
146655, 4, 96, 0, 1, ''
146656, 4, 100, 0, 1, ''
146657, 4, 102, 0, 1, 'container2'
..
My money was on the group of three image paths as possible replacements. So the resulting SQL now should be:
UPDATE catalog_product_entity_media_gallery AS mg,
catalog_product_entity_media_gallery_value AS mgv,
catalog_product_entity_varchar AS ev
SET ev.value = mg.value
WHERE mg.value_id = mgv.value_id
AND mg.entity_id = ev.entity_id
AND ev.attribute_id IN (79, 80, 81) # <-- attribute IDs updated here
AND mgv.position = 1;
So I committed to it, ran it and.. presto! All fixed! You might also want to encapsulate this in a transaction if you want. But this is out of this question's scope.
Well, this is the fix that worked for me so far! If there are any more out there, please share!

There was:
146652, 4, 79, 0, 1, '/B/0/B05-01.jpg'
146653, 4, 80, 0, 1, '/B/0/B05-01.jpg'
146654, 4, 81, 0, 1, '/B/0/B05-01.jpg'
So it should be:
AND ev.attribute_id IN (79, 80, 81) # <-- attribute IDs updated here
instead of:
AND ev.attribute_id IN (78, 80, 81) # <-- attribute IDs updated here
Is looking for something similar.

UPDATE catalog_product_entity_media_gallery AS mg,
catalog_product_entity_media_gallery_value AS mgv,
catalog_product_entity_varchar AS ev
SET ev.value = mg.value
WHERE mg.value_id = mgv.value_id
AND mg.entity_id = ev.entity_id
AND ev.attribute_id IN (79, 80, 81) # <-- attribute IDs updated here
AND mgv.position = 1;

Related

Do huggingface translation models support separate vocabulary for source and target?

Every example I've looked at so far seems to use a shared vocabulary between source and target languages, and I'm wondering if that is a hard-coded constraint of the Huggingface models, or my misunderstanding, or I've just not looked in the right place yet?
To take a random example, when I look at the files here, https://huggingface.co/Helsinki-NLP/opus-mt-en-zls/tree/main, I see separate "spm" (sentience piece model) files for source and target languages, and they are of different sizes (792kb vs. 850kb). But there is only a single "vocab.json" file. And the config.json file only mentions a single "vocab_size": 57680.
I've also been experimenting, e.g. tokenizer(inputs, text_target=inputs, return_tensors="pt"). If source and target used different vocabulary I would expect the returned input_ids and labels to use different numbers. But every model I've tried so far the numbers are identical (NO, my mistake - see update below).
Can a Huggingface tokenizer even support two vocabularies? If not then a model would need two tokenizers, which seems to clash with the way AutoTokenizer works.
UPDATE
Here is a test script to show the above model is actually using two spm vocabs with AutoTokenizer.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = 'Helsinki-NLP/opus-mt-en-zls'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
inputs = ['Filter all items from same host']
targets = ['Filtriraj sve stavke s istog hosta']
x=tokenizer(inputs, text_target=targets, return_tensors="pt")
print(x)
print(tokenizer.decode(x['input_ids'][0]))
print(tokenizer.decode(x['labels'][0]))
print("\nGiving inputs on both sides")
x=tokenizer(inputs, text_target=inputs, return_tensors="pt")
print(x) ## Expecting to see different numbers if they use different vocabs
print(tokenizer.decode(x['input_ids'][0]))
print(tokenizer.decode(x['labels'][0]))
print("\nGiving targets on both sides")
x=tokenizer(targets, text_target=targets, return_tensors="pt") ## Expecting to see different numbers if they use different vocabs
print(x)
print(tokenizer.decode(x['input_ids'][0]))
print(tokenizer.decode(x['labels'][0]))
print(model)
The output is:
{'input_ids': tensor([[10373, 90, 8255, 98, 605, 6276, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1]]), 'labels': tensor([[11638, 1392, 7636, 386, 35861, 95, 2130, 218, 6276, 27,
0]])}
▁Filter all▁items from same host</s>
Filtriraj sve stavke s istog hosta</s>
Giving inputs on both sides
{'input_ids': tensor([[10373, 90, 8255, 98, 605, 6276, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1]]), 'labels': tensor([[11638, 911, 90, 3188, 7, 98, 605, 6276, 0]])}
▁Filter all▁items from same host</s>
Filter all items from same host</s>
Giving targets on both sides
{'input_ids': tensor([[11638, 1392, 7636, 95, 120, 914, 465, 478, 95, 29,
25, 897, 6276, 27, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]), 'labels': tensor([[11638, 1392, 7636, 386, 35861, 95, 2130, 218, 6276, 27,
0]])}
Filtriraj sve stavke s istog hosta</s>
Filtriraj sve stavke s istog hosta</s>
When I choose identical strings in English or Croatian it gives slightly different numbers, showing that different tokenizers are involved. You can then see that the different ids sometimes map back to an identical string, sometimes not.
But when I print out the model we see it is actually a shared vocabulary, which makes the two spm models a bit pointless.
(encoder): MarianEncoder(
(embed_tokens): Embedding(57680, 512, padding_idx=57679)
...
(decoder): MarianDecoder(
(embed_tokens): Embedding(57680, 512, padding_idx=57679)
...
(lm_head): Linear(in_features=512, out_features=57680, bias=False)
I haven't got as far as finding out if a non-shared vocabulary is possible, but still yet to see evidence of one.
For Marian-based models, HuggingFace now supports separate vocabularies for source and target, but some models may not, especially older models.
(As you know, OPUS-MT models are based on MarianMT. The MarianMT framework supports it.)
Before https://github.com/huggingface/transformers/pull/15831, HuggingFace used a shared vocabulary file for Marian.
This PR updates the Marian model:
To allow not sharing embeddings between encoder and decoder.
Allow tying only decoder embeddings with lm_head.
Separate two vocabs in tokenizer for src and tgt language
...
share_encoder_decoder_embeddings: to indicate if emb should be shared or not
So models trained with earlier versions of the framework, or that parameter set to false, only have one shared vocabulary file for source and target.

Boto3 Amplify list apps

I have a lot of amplify apps which I want to manage via Lambdas. What is the equivalent of the cli command aws amplify list-apps in boto3, I had multiple attempts, but none worked out for me.
My bit of code that was using nextToken looked like this:
amplify = boto3.client('amplify')
apps = amplify.list_apps()
print(apps)
print('First token is: ', apps['nextToken'])
while 'nextToken' in apps:
apps = amplify.list_apps(nextToken=apps['nextToken'])
print('=====NEW APP=====')
print(apps)
print('=================')
Then I tried to use paginators like:
paginator = amplify.get_paginator('list_apps')
response_iterator = paginator.paginate(
PaginationConfig={
'MaxItems': 100,
'PageSize': 100
}
)
for i in response_iterator:
print(i)
Both of the attempts were throwing inconsistent output. The first one was printing first token and second entry but nothing more. The second one gives only the first entry.
Edit with more attemptsinfo + output. Bellow piece of code:
apps = amplify.list_apps()
print(apps)
print('---------------')
new_app = amplify.list_apps(nextToken=apps['nextToken'], maxResults=100)
print(new_app)
print('---------------')```
Returns (some sensitive output bits were removed):
EVG_long_token_x4gbDGaAWGPGOASRtJPSI='}
---------------
{'ResponseMetadata': {'RequestId': 'f6...e9eb', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/json', 'content-length': ...}, 'RetryAttempts': 0}, 'apps': [{'appId': 'dym7444jed2kq', 'appArn': 'arn:aws:amplify:us-east-2:763175725735:apps/dym7444jed2kq', 'name': 'vesting-interface', 'tags': {}, 'repository': 'https://github.com/...interface', 'platform': 'WEB', 'createTime': datetime.datetime(2021, 5, 4, 3, 41, 34, 717000, tzinfo=tzlocal()), 'updateTime': datetime.datetime(2021, 5, 4, 3, 41, 34, 717000, tzinfo=tzlocal()), 'environmentVariables': {}, 'defaultDomain': 'dym7444jed2kq.amplifyapp.com', 'customRules': _rules_, 'productionBranch': {'lastDeployTime': datetime.datetime(2021, 5, 26, 15, 10, 7, 694000, tzinfo=tzlocal()), 'status': 'SUCCEED', 'thumbnailUrl': 'https://aws-amplify-', 'branchName': 'main'}, - yarn install\n build:\n commands:\n - yarn run build\n artifacts:\n baseDirectory: build\n files:\n - '**/*'\n cache:\n paths:\n - node_modules/**/*\n", 'customHeaders': '', 'enableAutoBranchCreation': False}]}
---------------
I am very confused, why next iteration doesn't has nextToken and how can I get to the next appId.
import boto3
import json
session=boto3.session.Session(profile_name='<Profile_Name>')
amplify_client=session.client('amplify',region_name='ap-south-1')
output=amplify_client.list_apps()
print(output['apps'])

K-fold cross validation - save folds for different models

I am trying to train my models and validate them using sklearn's cross validation. What I want to do is use the same folds across all of my models (which will be running from different python scripts).
How can I do this? Should I save them to a file? or should I save the kfold model? or should I use the same seed?
kfold = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
Well the easiest way I found to save the folds was to simply get them from the stratified k fold split method by looping over it. Then storing it to a json file:
kfold = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
folds = {}
count = 1
for train, test in kfold.split(np.zeros(len(y)), y.argmax(1)):
folds['fold_{}'.format(count)] = {}
folds['fold_{}'.format(count)]['train'] = train.tolist()
folds['fold_{}'.format(count)]['test'] = test.tolist()
count += 1
print(len(folds) == n_splits)#assert we have the same number of splits
#dump folds to json
import json
with open('folds.json', 'w') as fp:
json.dump(folds, fp)
Note 1: Argmax here is used because my y values are one hot variables so we need to get the class that is predicted/ground truth.
Now to load it from any other script:
#load to dict to be used
with open('folds.json') as f:
kfolds = json.load(f)
From here we can easily just loop over the elements in the dict:
for key, val in kfolds.items():
print(key)
train = val['train']
test = val['test']
Our json file looks like so:
{"fold_1": {"train": [193, 2405, 2895, 565, 1215, 274, 2839, 1735, 2536, 1196, 40, 2541, 980,...SNIP...830, 1032], "test": [1, 5, 6, 7, 10, 15, 20, 26, 37, 45, 52, 54, 55, 59, 60, 64, 65, 68, 74, 76, 78, 90, 100, 106, 107, 113, 122, 124, 132, 135, 141, 146,...SNIP...]}

Laravel Carbon - Calculate time between database entries

I am trying to calculate de difference between two datetimes.
I have two Models, tasks and task_interactions, they are related like this: tasks has many task_interactions, and task_interactions belongs to tasks.
The user can change the status of a task to one of these: "In progress", "Paused" and Finished. The user change the status several times between "In Progress" and "Paused", but once it's finished, the user cannot change the task status anymore.
All interactions are saved in the task_interactions table, with created_by as user_id and created_at as the datetime. the table has these columns: id ; task_id ; status ; comments ; created_by ; created_at.
Here's na exemple for task id = 1:
# id, task_id, status, comments, created_by, created_at
144, 1, In Progress, , 1, 2017-11-14 09:42:20
145, 1, Paused, , 1, 2017-11-14 09:45:53
146, 1, In Progress, , 1, 2017-11-14 09:46:57
147, 1, Paused, , 1, 2017-11-14 09:49:57
148, 1, In Progress, , 1, 2017-11-14 10:00:10
149, 1, Paused, , 1, 2017-11-14 10:06:25
150, 1, In Progress, , 1, 2017-11-14 10:07:26
151, 1, Paused, , 1, 2017-11-14 10:08:11
153, 1, Paused, , 1, 2017-11-14 11:27:04
154, 1, In Progress, , 1, 2017-11-14 11:27:21
155, 1, Paused, , 1, 2017-11-14 11:57:38
The thing is, for me to have the total time spent in one task, i have to determine the time between In Progress and Paused. The time between Paused and In Progress is not relevant.
I am strugling with this. Is there anyone who can help me?
Thanks in advance.
Just one more thing, i am not very experienced, so pardon if my questions sound a bit dummer, I believe that i should do a #foreach loop through all task_interactions where task_id = X, and thats all i know.
As asked, here's the data i am passing for the view:
public function mytasks()
{
$users = User::all();
$mytasks = Task::where('assigned_to','=', Auth::id())->where('status','!=','Finished')->get();
$mytaskscount = Task::where('assigned_to','=', Auth::id())->count();
$tasksinprogress = Task::where('status','=','In Progress')->where('assigned_to','=', Auth::id())->get();
$tasksinprogresscount = Task::where('status','=','In Progress')->where('assigned_to','=', Auth::id())->count();
$taskspaused = Task::where('status','=','Paused')->where('assigned_to','=', Auth::id())->get();
$taskspausedcount = Task::where('status','=','Paused')->where('assigned_to','=', Auth::id())->count();
$tasksfinished = Task::where('status','=','Finished')->where('assigned_to','=', Auth::id())->get();
$tasksfinishedcount = Task::where('status','=','Finished')->where('assigned_to','=', Auth::id())->count();
$tasktypes = Tasktype::all();
$chapters = Chapter::all();
return view('pages.tasks_my', compact('mytasks','tasktypes','chapters','tasksinprogress','taskspaused','tasksfinished','mytaskscount','tasksinprogresscount','taskspausedcount','tasksfinishedcount'));
}
And one more thing, what i know so far
#foreach ($tasks as $task)
#foreach ($task->interactions as $interactions)
{{ In here, every time my loop returns a status = "In progress" i should mark that time, and if that the next loop returns a "Paused", i should count the difference between these two. And sum all other equal cases in the loop. }}
#endforeach
#endforeach
I realy have no idea on how to do that
I would recommend David's answer. If you are not comfortable doing it using MySQL. You may have to do it like this:
Calculate difference between first time a task was started & when it was paused:
$start = Carbon::parse($startTask->created_at);
$pause = Carbon::parse($endTask->created_at);
$diffInSeconds = $pause->diffInSeconds($start)
This should give you the total difference in seconds as to when a tasks was started & then paused. Now you have to repeat this approach for when again a task was started & then paused.
This is strict Mysql, but should do the trick.
You can call this in laravel directly using raw sql instruction, or you should probably rework this into Laravel code.
select SEC_TO_TIME(
sum(
timediff(
if(task_end.created_at is null,
NOW(),
task_end.created_at),
task_start.created_at)))
from task_interactions task_start
left join task_interactions task_end
ON task_end.task_id = task_start.task_id
AND task_end.status IN ('Pause', 'Finished')
AND task_end.created_at = (
SELECT min(created_at)
FROM task_interactions
WHERE status = task_end.status
AND created_at >= task_start.created_at
)
WHERE task_start.task_id = 1
AND task_start.status = 'In Progress'

use for loop to call multiple functions in lua

I want to call multiple methods in lua that are very similar except their parameters change by one character. The way I'm doing it now works but is extremely in efficient.
function scene:createScene(event)
screenGroup = self.view
level1= display.newRoundedRect( 50, 110, 50, 50, 5 )
level1:setFillColor( 100,0,200 )
level2= display.newRoundedRect( 105, 110, 50, 50, 5 )
level2:setFillColor (100,200,0)
--and so on so forth
screenGroup:insert (level1)
screenGroup:insert (level2)
screenGroup:insert (level3)
screenGroup:insert (level4)
end
I plan on extending the screenGroop:insert method to hundreds of levels, maybe up to (level300). As you can see the way I'm doing it now is inefficient. I tried doing
for i=1, 4, 1 do
screenGroup:insert(level..i)
end
but I get the error "table expected."
The best way in this case is to probably use a table:
local levels = {}
levels[1] = display.newRoundedRect( 50, 110, 50, 50, 5 )
levels[1]:setFillColor( 100,0,200 )
levels[2] = display.newRoundedRect( 105, 110, 50, 50, 5 )
levels[2]:setFillColor (100,200,0)
--and so on so forth
for _, level in ipairs(levels) do
screenGroup:insert(level)
end
For other alternatives check the SO answer from #EtanReisner's comment.
If your 'level' tables are global, which is appears they are, you can use getfenv to index them.
for i = 1, number_of_levels do
screenGroup:insert(getfenv()["level" .. i])
end
getfenv returns the environment, with all global variables, in the form of a dictionary. Therefore, you can index it like a normal table like getfenv()["key"]

Resources