Is there a better/faster way to insert data to a database from an external api in Laravel?


I am currently getting data from an external API for use in my Laravel API. I have everything working, but it feels slow.
I'm getting the data from the API with Http::get('url') and that is fast. Things only slow down when I start looping through the data and making edits.
I don't need all the data, but I would still like to edit it before it goes into the database, since the source isn't very consistent. I also have a few columns that are derived from the data with some logic, so that each app/site doesn't need to do it.
I am saving to the database on each iteration of the foreach loop with the Eloquent Model::updateOrCreate() method. This works, but the JSON files can easily be 6000 lines long or more, so it obviously takes time to loop through each set, modify values, and then save to the database each time. There usually aren't more than 200 or so entries, but it still takes time. I will probably eventually switch to the new upsert() method to make fewer queries to the database. On my localhost it currently takes about a minute and a half to run, which just seems way too long.
Here is a shortened version of how I was looping through the data.
$json = json_decode($contents, true);
$features = $json['features'];
foreach ($features as $feature){
// Get ID
$id = $feature['id'];
// Get primary condition data
$geometry = $feature['geometry'];
$properties = $feature['properties'];
// Get secondary geometry data
$geometryType = $geometry['type'];
$coordinates = $geometry['coordinates'];
Model::updateOrCreate(
[
'id' => $id,
],
[
'coordinates' => $coordinates,
'geometry_type' => $geometryType,
]);
}
Most of what I'm doing behind the scenes to the data before going into the database is cleaning up some text strings but there are a few logic things to normalize or prep the data for websites and apps.
Is there a more efficient way to get the same result? This will ultimately be used in a scheduler and run on an interval.
Example Data structure from API documentation
{
"$schema": "http://json-schema.org/draft-04/schema#",
"additionalProperties": false,
"properties": {
"features": {
"items": {
"additionalProperties": false,
"properties": {
"attributes": {
"type": [
"object",
"null"
]
},
"geometry": {
"additionalProperties": false,
"properties": {
"coordinates": {
"items": {
"items": {
"type": "number"
},
"type": "array"
},
"type": "array"
},
"type": {
"type": "string"
}
},
"required": [
"coordinates",
"type"
],
"type": "object"
},
"properties": {
"additionalProperties": false,
"properties": {
"currentConditions": {
"items": {
"properties": {
"additionalData": {
"type": "string"
},
"conditionDescription": {
"type": "string"
},
"conditionId": {
"type": "integer"
},
"confirmationTime": {
"type": "integer"
},
"confirmationUserName": {
"type": "string"
},
"endTime": {
"type": "integer"
},
"id": {
"type": "integer"
},
"sourceType": {
"type": "string"
},
"startTime": {
"type": "integer"
},
"updateTime": {
"type": "integer"
}
},
"required": [
"id",
"userName",
"updateTime",
"startTime",
"conditionId",
"conditionDescription",
"confirmationUserName",
"confirmationTime",
"sourceType",
"endTime"
],
"type": "object"
},
"type": "array"
},
"id": {
"type": "string"
},
"name": {
"type": "string"
},
"nameId": {
"type": "string"
},
"parentAreaId": {
"type": "integer"
},
"parentSubAreaId": {
"type": "integer"
},
"primaryLatitude": {
"type": "number"
},
"primaryLongitude": {
"type": "number"
},
"primaryMP": {
"type": "number"
},
"routeId": {
"type": "integer"
},
"routeName": {
"type": "string"
},
"routeSegmentIndex": {
"type": "integer"
},
"secondaryLatitude": {
"type": "number"
},
"secondaryLongitude": {
"type": "number"
},
"secondaryMP": {
"type": "number"
},
"sortOrder": {
"type": "integer"
}
},
"required": [
"id",
"name",
"nameId",
"routeId",
"routeName",
"primaryMP",
"secondaryMP",
"primaryLatitude",
"primaryLongitude",
"secondaryLatitude",
"secondaryLongitude",
"sortOrder",
"parentAreaId",
"parentSubAreaId",
"routeSegmentIndex",
"currentConditions"
],
"type": "object"
},
"type": {
"type": "string"
}
},
"required": [
"type",
"geometry",
"properties",
"attributes"
],
"type": "object"
},
"type": "array"
},
"type": {
"type": "string"
}
},
"required": [
"type",
"features"
],
"type": "object"
}
Second, related question.
Since this is being updated on an interval I have it updating and creating records from the json data, but is there an efficient way to delete old records that are no longer in the json file? I currently get an array of current ids and compare them to the new ids and then loop through each and delete them. There has to be a better way.

Have no idea what to say to your first question, but I think you may try to do something like this regarding the second question.
SomeModel::query()->whereNotIn('id', $newIds)->delete();
$newIds you can collect during the first loop.
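For the first question, the usual approach is to do all the clean-up in memory and then write everything in one batch instead of one query per record. The sketch below illustrates that pattern with Python's standard-library sqlite3 module as a stand-in for the database layer; the table name conditions and the function sync_features are hypothetical, and the column names are taken from the question. In Laravel 8 and later the equivalent calls would be roughly Model::upsert($rows, ['id'], ['coordinates', 'geometry_type']) followed by the whereNotIn delete shown above.

```python
import json
import sqlite3


def sync_features(conn: sqlite3.Connection, features: list) -> None:
    """Upsert every feature in one batch, then delete rows missing from the feed.

    Assumes a hypothetical table `conditions` whose primary key is `id`.
    """
    # Build all rows in memory first; per-record clean-up logic would go here.
    rows = [
        (f["id"], f["geometry"]["type"], json.dumps(f["geometry"]["coordinates"]))
        for f in features
    ]
    if not rows:
        # Assumption: an empty feed means a failed fetch, so nothing is deleted.
        return

    new_ids = [row[0] for row in rows]
    placeholders = ",".join("?" * len(new_ids))

    with conn:  # a single transaction for the whole sync
        conn.executemany(
            """
            INSERT INTO conditions (id, geometry_type, coordinates)
            VALUES (?, ?, ?)
            ON CONFLICT(id) DO UPDATE SET
                geometry_type = excluded.geometry_type,
                coordinates = excluded.coordinates
            """,
            rows,
        )
        # Remove records that are no longer present in the feed.
        conn.execute(
            f"DELETE FROM conditions WHERE id NOT IN ({placeholders})",
            new_ids,
        )
```

With around 200 entries this is one batched write plus one delete inside a single transaction, rather than hundreds of separate round trips.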

Related

Validating yaml file by json scheme in monaco editor gives incorrect error

monaco.languages.json.jsonDefaults.setDiagnosticsOptions({
validate: true,
schemas: [
{
"$schema": "http://json-schema.org/draft-06/schema#",
"$ref": "#/definitions/Welcome1",
"definitions": {
"Welcome1": {
"type": "object",
"additionalProperties": false,
"properties": {
"inventory": {
"type": "array",
"items": {
"$ref": "#/definitions/Inventory"
}
}
},
"required": [
"inventory"
],
"title": "Welcome1"
},
"Inventory": {
"type": "object",
"additionalProperties": false,
"properties": {
"devices": {
"$ref": "#/definitions/Devices"
},
"pollfrequency": {
"type": "integer"
},
"scopedinventoryobject": {
"type": "string"
}
},
"required": [
"devices",
"pollfrequency",
"scopedinventoryobject"
],
"title": "Inventory"
},
"Devices": {
"type": "object",
"additionalProperties": false,
"properties": {
"platformtypes": {
"type": "array",
"items": {
"type": "string"
}
}
},
"required": [
"platformtypes"
],
"title": "Devices"
}
}
}
]
});
I am attaching a screenshot of the incorrect validation I get. I haven't found a guide for using monaco-yaml, so I am just trying to validate against the JSON schema I have for the file. Is this the right way to do YAML validation in the Monaco editor, or should I take a different approach here?

I am facing the exact same problem. From what I understand, when we set the schema through monaco.languages.json.jsonDefaults.setDiagnosticsOptions,
the schema is applied only to models whose declared language is json.
So you are defining the schema, but you are using a different language in the model. (Even so, the error in the screenshot seems to refer to json.)
I'm trying to use the monaco-yaml package because it seems to implement this validation already.

Add images via Shopware 6 API

I have a Shopware 6.3 shop and need to migrate images to it using the integration API.
How should I construct a body for a media upload? Do I need to put a file somewhere or just pass in the link?
I have managed to push new products into Shopware via guide here: https://docs.shopware.com/en/shopware-platform-dev-en/admin-api-guide/writing-entities?category=shopware-platform-dev-en/admin-api-guide#creating-entities but I am not sure how to handle media. In this guide it is only explained how to create links between already uploaded media files to products in here https://docs.shopware.com/en/shopware-platform-dev-en/admin-api-guide/writing-entities?category=shopware-platform-dev-en/admin-api-guide#media-handling but no examples as to how to actually push the media files.
I have URLs for each image I need (in the database, along with product IDs and image positions).
The entity schema describes media as:
"media": {
"name": "media",
"translatable": [
"alt",
"title",
"customFields"
],
"properties": {
"id": {
"type": "string",
"format": "uuid"
},
"userId": {
"type": "string",
"format": "uuid"
},
"mediaFolderId": {
"type": "string",
"format": "uuid"
},
"mimeType": {
"type": "string",
"readOnly": true
},
"fileExtension": {
"type": "string",
"readOnly": true
},
"uploadedAt": {
"type": "string",
"format": "date-time",
"readOnly": true
},
"fileName": {
"type": "string",
"readOnly": true
},
"fileSize": {
"type": "integer",
"format": "int64",
"readOnly": true
},
"metaData": {
"type": "object",
"readOnly": true
},
"mediaType": {
"type": "object",
"readOnly": true
},
"alt": {
"type": "string"
},
"title": {
"type": "string"
},
"url": {
"type": "string"
},
"hasFile": {
"type": "boolean"
},
"private": {
"type": "boolean"
},
"customFields": {
"type": "object"
},
"createdAt": {
"type": "string",
"format": "date-time",
"readOnly": true
},
"updatedAt": {
"type": "string",
"format": "date-time",
"readOnly": true
},
"translated": {
"type": "object"
},
"tags": {
"type": "array",
"entity": "tag"
},
"thumbnails": {
"type": "array",
"entity": "media_thumbnail"
},
"user": {
"type": "object",
"entity": "user"
},
"categories": {
"type": "array",
"entity": "category"
},
"productManufacturers": {
"type": "array",
"entity": "product_manufacturer"
},
"productMedia": {
"type": "array",
"entity": "product_media"
},
"avatarUser": {
"type": "object",
"entity": "user"
},
"mediaFolder": {
"type": "object",
"entity": "media_folder"
},
"propertyGroupOptions": {
"type": "array",
"entity": "property_group_option"
},
"mailTemplateMedia": {
"type": "array",
"entity": "mail_template_media"
},
"documentBaseConfigs": {
"type": "array",
"entity": "document_base_config"
},
"shippingMethods": {
"type": "array",
"entity": "shipping_method"
},
"paymentMethods": {
"type": "array",
"entity": "payment_method"
},
"productConfiguratorSettings": {
"type": "array",
"entity": "product_configurator_setting"
},
"orderLineItems": {
"type": "array",
"entity": "order_line_item"
},
"cmsBlocks": {
"type": "array",
"entity": "cms_block"
},
"cmsSections": {
"type": "array",
"entity": "cms_section"
},
"cmsPages": {
"type": "array",
"entity": "cms_page"
},
"documents": {
"type": "array",
"entity": "document"
}
}
},
but it is not clear which fields are crucial. Do I need to create a product-media folder first and then use its id when making a POST request to the media endpoint? Can I just specify the URL, and will Shopware download the image itself to a folder, or will it keep pointing to the URL I used? I need to house the images inside Shopware.
It is no problem for me to download the images from the URLs and push them to Shopware, but I am not sure how to use the API for it (there are a lot of images and they need to be done in bulk).

One possible solution:
First, create a new media entity: POST /api/{apiVersion}/media?_response=true
Second, upload the image: POST /api/{apiVersion}/_action/media/{mediaId}/upload?extension={extension}&fileName={imgName}&_response=true
More information can be found here: https://forum.shopware.com/discussion/comment/278603/#Comment_278603
If the images are for products, use the endpoint POST /api/{apiVersion}/product-media and set the coverId.
A complete listing of all routes is available via the OpenAPI schema: [your-domain/localhost]/api/v3/_info/openapi3.json
It is also possible to set all the media, the cover, and the coverId in a single request during product creation. To do so, set the product cover and product media:
{
"coverId":"3d5ebde8c31243aea9ecebb1cbf7ef7b",
"productNumber":"SW10002","active":true,"name":"Test",
"description":"fasdf",
"media":[{
"productId":"94786d894e864783b546fbf7c60a3640",
"mediaId":"084f6aa36b074130912f476da1770504",
"position":0,
"id":"3d5ebde8c31243aea9ecebb1cbf7ef7b"
},
{
"productId":"94786d894e864783b546fbf7c60a3640",
"mediaId":"4923a2e38a544dc5a7ff3e26a37ab2ae",
"position":1,
"id":"600999c4df8b40a5bead55b75efe688c"
}],
"id":"94786d894e864783b546fbf7c60a3640"
}
Remember to check that the bearer token is still valid before each request, for example like this:
// Refresh when the token expires within the next five minutes.
if (JwtToken.ValidTo < DateTime.UtcNow.AddMinutes(5))
{
// refresh the token by authenticating again
IntegrationAuthenticator(this.key, this.secret);
}
return Client.Get(request);

This will work for Shopware 6.4.
As general advice: it depends. The APIs changed a little in 6.4, and official documentation is also available at https://shopware.stoplight.io/docs/admin-api/docs/guides/media-handling.md.
However, I think it is always a little easier to have a real-life example. What I do in our production environment is basically these steps:
(Optional) Check whether the media object exists.
Create a media-file object using the endpoint GET /media-files/
If it exists, then upload an image using the new media-id reference.
Let us assume the filename is yourfilename.jpg. You will also need a media-folder-id, which references the image folder within Shopware. This can be obtained in Shopware via Admin > Content > Media > Product Media.
Step 0
Before uploading an image to Shopware, you want to ensure that the image does not already exist, so that you can skip it.
This step is optional, as it is not required in order to create an image. However, you will want some sort of validation mechanism in a production environment.
Request-Body
POST api/search/media
This will run a request against the Shopware-API with a response.
{
"filter":[
{
"type":"equals",
"field":"fileName",
"value":"yourfilename"
},
{
"type":"equals",
"field":"fileExtension",
"value":"jpg"
},
{
"type":"equals",
"field":"mediaFolderId",
"value":"d798f70b69f047c68810c45744b43d6f"
}
],
"includes":{
"media":[
"id"
]
}
}
Step 1
Create a new media-file
Request-Body
POST api/_action/sync
This request will create a new media-object in Shopware.
The value for media_id can be any valid UUID. I will use this value: 94f83a75669647288d4258f670a53e69
The customFields property is optional. I just use it to keep a reference to a hash value which I could use to validate changed values.
The value for the media folder id is the one you get from your Shopware backend.
{
"create-media": {
"entity": "media",
"action": "upsert",
"payload": [
{
"id": "{{media_id}}",
"customFields": {"hash": "{{file.hash}}"},
"mediaFolderId": "{{mediaFolderId}}"
}
]
}
}
Response
The response will tell you that everything works as expected.
{
"success":true,
"data":{
"create-media":{
"result":[
{
"entities":{
"media":[
"94f83a75669647288d4258f670a53e69"
],
"media_translation":[
{
"mediaId":"94f83a75669647288d4258f670a53e69",
"languageId":"2fbb5fe2e29a4d70aa5854ce7ce3e20b"
}
]
},
"errors":[
]
}
],
"extensions":[
]
}
},
"extensions":[
]
}
Step 2
This is the step where we upload an image to Shopware. We will use the variant with the content type image/jpeg. However, a payload with a url attribute would also work. See the details in the official documentation.
Request-Body
POST api/_action/media/94f83a75669647288d4258f670a53e69/upload?extension=jpg&fileName=yourfilename
Note that the media-id is part of the URL, and so is the filename, but without the file extension (jpg).
The body is pretty straightforward, and in our case there is no JSON payload, as we upload with Content-Type: "image/jpeg".
This would be the payload if you want to use a URL as the resource:
{
"url": "<url-to-your-image>"
}
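Since the question asks about bulk migration, the three steps above can be sketched as a small helper that assembles the requests for one image. This is a minimal Python sketch that only builds the method, path, and JSON body for each call and does not send anything; the function name build_media_requests is hypothetical, and the paths and payload shapes are copied from the steps above. Generating the id with uuid4().hex is an assumption based on the 32-character hex ids shown in this answer.

```python
import uuid
from urllib.parse import urlencode


def build_media_requests(file_name, extension, media_folder_id, image_url, media_id=None):
    """Return the search, create, and upload requests as (method, path, body) tuples."""
    # Assumption: ids use the 32-character hex form seen in the examples above.
    media_id = media_id or uuid.uuid4().hex

    # Step 0: look for an existing media object with the same name in the folder.
    search = ("POST", "api/search/media", {
        "filter": [
            {"type": "equals", "field": "fileName", "value": file_name},
            {"type": "equals", "field": "fileExtension", "value": extension},
            {"type": "equals", "field": "mediaFolderId", "value": media_folder_id},
        ],
        "includes": {"media": ["id"]},
    })

    # Step 1: create the media entity through the sync endpoint.
    create = ("POST", "api/_action/sync", {
        "create-media": {
            "entity": "media",
            "action": "upsert",
            "payload": [{"id": media_id, "mediaFolderId": media_folder_id}],
        }
    })

    # Step 2: attach the file, here using the URL variant of the upload call.
    query = urlencode({"extension": extension, "fileName": file_name})
    upload = ("POST", f"api/_action/media/{media_id}/upload?{query}", {"url": image_url})

    return [search, create, upload]
```

Each tuple can then be passed to whatever HTTP client is in use, with the bearer token added as a header. For a large batch, the payload list in step 1 can carry several media entries in a single sync request.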

Nifi JoltTransformRecord UUID in default transform not working as expected

I have a NiFi workflow which uses JoltTransformRecord to do some record-based manipulation of the data. I have to create a default uuid value in each message in the flow file.
My JoltTransformRecord configuration is as below.
Jolt specification:
[{
"operation": "shift",
"spec": {
"payload": "data.payload"
}
}, {
"operation": "default",
"spec": {
"header": {
"source": "${source}",
"client_id": "${client_id}",
"uuid": "${UUID()}",
"payload_type":"${payload_type}"
}
}
}]
The shift operation and all other default operations work as expected, but the UUID comes out the same for all messages. I need a different UUID for each message. I don't want to add another processor for this purpose only.
My workflow below :
Reader & Writer configurations for JoltRecord processor is :
IngestionSchemaJsonTreeReader ( From JsonTreeReader Processor ):
IngestionSchemaAvroRecordSetWriter ( From AvroWriter Processor ) :
Configured schema registry has below schemas defined in it.
com.xyz.ingestion.pre_json
{
"type": "record",
"name": "event",
"namespace": "com.xyz.ingestion.raw",
"doc": "Event ingested to kafka",
"fields": [
{
"name": "payload",
"type": [
"null",
"string"
],
"default": "null"
}
]
}
com.xyz.ingestion.raw -
{
"type": "record",
"name": "event",
"namespace": "com.xyz.ingestion.raw",
"doc": "Event ingested to kafka",
"fields": [
{
"type": {
"name": "header",
"type": "record",
"namespace": "com.xyz.ingestion.raw.header",
"doc": "Header data for event ingested",
"fields": [
{
"name": "payload_type",
"type": "string"
},
{
"name": "uuid",
"type": "string",
"size": "36"
},
{
"name": "client_id",
"type": "string"
},
{
"name": "source",
"type": "string"
}
]
},
"name": "header"
},
{
"type": {
"name": "data",
"type": "record",
"namespace": "com.xyz.ingestion.raw.data",
"doc": "Payload for event ingested",
"fields": [
{
"name": "payload",
"type": [
"null",
"string"
],
"default": "null"
}
]
},
"name": "data"
}
]
}

The expression language is evaluated per record. UUID() is executed for each evaluation. So uuid must be unique for each record. From the information you provided I cannot see why you are getting duplicate uuids.
I tried to reproduce your problem with following flow:
GenerateFlowFile:
SplitJson: configure $ as JsonPathExpression to split Json array into records.
JoltTransformRecord:
As you can see the way I am adding the UUID is not different from how you do it. But I am getting different UUIDs as expected:

Lucene search using Kibana does return my results

Using Kibana, I have created the following index:
put newsindex
{
"settings" : {
"number_of_shards":3,
"number_of_replicas":2
},
"mappings" : {
"news": {
"properties": {
"NewsID": {
"type": "integer"
},
"NewsType": {
"type": "text"
},
"BodyText": {
"type": "text"
},
"Caption": {
"type": "text"
},
"HeadLine": {
"type": "text"
},
"Approved": {
"type": "text"
},
"Author": {
"type": "text"
},
"Contact": {
"type": "text"
},
"DateCreated": {
"type": "date",
"format": "date_time"
},
"DateSubmitted": {
"type": "date",
"format": "date_time"
},
"LastModifiedDate": {
"type": "date",
"format": "date_time"
}
}
}
}
}
I have populated the index with Logstash. If I just perform a match_all query, all my records are returned as you'd expect. However, when I try to perform a targeted query such as:
get newsindex/_search
{
"query":{"match": {"headline": "construct abnomolies"}
}
}
I can see headline as a property of _source, but my query is ignored, i.e. I still receive everything, regardless of what's in the headline. How do I need to change my index to make headline searchable? I'm using Elasticsearch 5.6.3.

I needed to change the property names in my index mapping to lowercase. I noticed in the output window that the properties under _source were lowercase. In Kibana, the predictive text was offering both my notation and the lowercase one. I've dropped my index and re-populated it, and it now works.

Querying inner objects by field name in Elasticsearch

I'm reading the documentation on this site: https://www.elastic.co/guide/en/elasticsearch/guide/current/complex-core-fields.html#_how_inner_objects_are_indexed
I'm interested in this paragraph:
Inner fields can be referred to by name (for example, first). To distinguish between two fields that have the same name, we can use the full path (for example, user.name.first) or even the type name plus the path (tweet.user.name.first).
So if I have the example from the linked documentation page:
{
"gb": {
"tweet": {
"properties": {
"tweet": { "type": "string" },
"user": {
"type": "object",
"properties": {
"id": { "type": "string" },
"gender": { "type": "string" },
"age": { "type": "long" },
"name": {
"type": "object",
"properties": {
"full": { "type": "string" },
"first": { "type": "string" },
"last": { "type": "string" }
}
}
}
}
}
}
}
}
According to the documentation, I should be able to search with the condition last: whatever, but it does not work. I always have to use the full path user.name.last: whatever. Is the documentation wrong, or is my understanding of it? Note that last occurs only in the inner object, so in theory the full path should not be necessary to reference it.
edit:
query that works:
get /my_index/_search?q=user.name.last:test
query that does not work but should according to documentation:
get /my_index/_search?q=last:test
