I'm using the MongoDB driver for Ruby here. Once this works, I want to run it as a scheduled task in Ruby on Rails 3 with the Mongoid ODM.
So for now, I'm experimenting in plain Ruby.
I've noticed that the crack gem is very convenient for converting an XML file into a format that can be inserted into MongoDB. Crack converts the XML into a Ruby Hash, which looks close to JSON (it uses "=>" instead of ":" between keys and values) and is what the MongoDB driver for Ruby needs before I can insert it into the database, as shown here.
The problem is that, the way I'm using crack below, it imports everything in the XML file.
Please see below.
sample.xml
<?xml version="1.0" encoding="utf-8"?>
<ShipmentRequest>
<Envelope>
<TransmissionDateTime>05/08/2013 23:06:02</TransmissionDateTime>
</Envelope>
<Message>
<Comment />
<Header>
<MemberId>A00000001</MemberId>
<MemberName>Bruce</MemberName>
<DeliveryId>6377935</DeliveryId>
<ShipToAddress1>123-4567</ShipToAddress1>
<OrderDate>05/08/13</OrderDate>
<Payments>
<PayType>Credit Card</PayType>
<Amount>1000</Amount>
</Payments>
<Payments>
<PayType>Points</PayType>
<Amount>5390</Amount>
</Payments>
</Header>
<Line>
<LineNumber>3.1</LineNumber>
<Item>fruit-004</Item>
<Description>Peach</Description>
<Quantity>1</Quantity>
<UnitCost>1610</UnitCost>
<DeclaredValue>0</DeclaredValue>
<PointValue>13</PointValue>
</Line>
<Line>
<LineNumber>8.1</LineNumber>
<Item>fruit-001</Item>
<Description>Fruit Set</Description>
<Quantity>1</Quantity>
<UnitCost>23550</UnitCost>
<PointValue>105</PointValue>
<PickLine>
<PickLineNumber>8.1..1</PickLineNumber>
<PickItem>fruit-002</PickItem>
<PickDescription>Apple</PickDescription>
<PickQuantity>1</PickQuantity>
</PickLine>
<PickLine>
<PickLineNumber>8.1..2</PickLineNumber>
<PickItem>fruit-003</PickItem>
<PickDescription>Orange</PickDescription>
<PickQuantity>2</PickQuantity>
</PickLine>
</Line>
</Message>
</ShipmentRequest>
sample_crack.rb
#!/usr/bin/ruby
require "crack"
require 'mongo'
include Mongo
mongo_client = MongoClient.new("localhost", 27017)
db = mongo_client.db("somedb")
coll = db.collection("somecoll")
myXML = Crack::XML.parse(File.read("sample.xml"))
coll.insert(myXML)
puts myXML
It prints on console:
{"ShipmentRequest"=>{"Envelope"=>{"TransmissionDateTime"=>"05/08/2013 23:06:02"}, "Message"=>{"Comment"=>nil, "Header"=>{"MemberId"=>"A00000001", "MemberName"=>"Bruce", "DeliveryId"=>"6377935", "ShipToAddress1"=>"123-4567", "OrderDate"=>"05/08/13", "Payments"=>[{"PayType"=>"Credit Card", "Amount"=>"1000"}, {"PayType"=>"Points", "Amount"=>"5390"}]}, "Line"=>[{"LineNumber"=>"3.1", "Item"=>"fruit-004", "Description"=>"Peach", "Quantity"=>"1", "UnitCost"=>"1610", "DeclaredValue"=>"0", "PointValue"=>"13"}, {"LineNumber"=>"8.1", "Item"=>"fruit-001", "Description"=>"Fruit Set", "Quantity"=>"1", "UnitCost"=>"23550", "PointValue"=>"105", "PickLine"=>[{"PickLineNumber"=>"8.1..1", "PickItem"=>"fruit-002", "PickDescription"=>"Apple", "PickQuantity"=>"1"}, {"PickLineNumber"=>"8.1..2", "PickItem"=>"fruit-003", "PickDescription"=>"Orange", "PickQuantity"=>"2"}]}]}}, :_id=>BSON::ObjectId('51ad8d83a3d24b3b9f000001')}
In the mongodb, the converted XML file looks like:
{
"_id" : ObjectId("51ad8d83a3d24b3b9f000001"),
"ShipmentRequest" : {
"Envelope" : {
"TransmissionDateTime" : "05/08/2013 23:06:02"
},
"Message" : {
"Comment" : null,
"Header" : {
"MemberId" : "A00000001",
"MemberName" : "Bruce",
"DeliveryId" : "6377935",
"ShipToAddress1" : "123-4567",
"OrderDate" : "05/08/13",
"Payments" : [
{
"PayType" : "Credit Card",
"Amount" : "1000"
},
{
"PayType" : "Points",
"Amount" : "5390"
}
]
},
"Line" : [
{
"LineNumber" : "3.1",
"Item" : "fruit-004",
"Description" : "Peach",
"Quantity" : "1",
"UnitCost" : "1610",
"DeclaredValue" : "0",
"PointValue" : "13"
},
{
"LineNumber" : "8.1",
"Item" : "fruit-001",
"Description" : "Fruit Set",
"Quantity" : "1",
"UnitCost" : "23550",
"PointValue" : "105",
"PickLine" : [
{
"PickLineNumber" : "8.1..1",
"PickItem" : "fruit-002",
"PickDescription" : "Apple",
"PickQuantity" : "1"
},
{
"PickLineNumber" : "8.1..2",
"PickItem" : "fruit-003",
"PickDescription" : "Orange",
"PickQuantity" : "2"
}
]
}
]
}
}
}
But I'd like to import it with the unneeded nodes removed and the empty ones ignored, like this:
{
"_id" : ObjectId("51ad8d83a3d24b3b9f000001"),
"MemberId" : "A00000001",
"MemberName" : "Bruce",
"DeliveryId" : "6377935",
"ShipToAddress1" : "123-4567",
"OrderDate" : "05/08/13",
"Payments" : [
{
"PayType" : "Credit Card",
"Amount" : "1000"
},
{
"PayType" : "Points",
"Amount" : "5390"
}
],
"Line" : [
{
"LineNumber" : "3.1",
"Item" : "fruit-004",
"Description" : "Peach",
"Quantity" : "1",
"UnitCost" : "1610",
"DeclaredValue" : "0",
"PointValue" : "13"
},
{
"LineNumber" : "8.1",
"Item" : "fruit-001",
"Description" : "Fruit Set",
"Quantity" : "1",
"UnitCost" : "23550",
"PointValue" : "105",
"PickLine" : [
{
"PickLineNumber" : "8.1..1",
"PickItem" : "fruit-002",
"PickDescription" : "Apple",
"PickQuantity" : "1"
},
{
"PickLineNumber" : "8.1..2",
"PickItem" : "fruit-003",
"PickDescription" : "Orange",
"PickQuantity" : "2"
}
]
}
]
}
Can this be done with crack? Or would it be better done with nokogiri?
Update
Big thanks to @Alex Peachey; here is the updated code.
sample_crack.rb (updated):
#!/usr/bin/ruby
require "crack"
require 'mongo'
include Mongo
mongo_client = MongoClient.new("localhost", 27017)
db = mongo_client.db("somedb")
coll = db.collection("somecoll")
myXML = Crack::XML.parse(File.read("sample.xml"))
myXML.merge!(myXML.delete("ShipmentRequest")) # not needed hash
myXML.merge!(myXML.delete("Message")) # not needed hash
myXML.merge!(myXML.delete("Header")) # not needed hash
myXML.delete("Envelope") # not needed hash
# planning to put here a code to remove hashes with empty values
coll.insert(myXML)
puts myXML
It's hard to say how you define "not-needed" nodes, but empty ones are easy enough to understand. Either way, Crack is very good at what it's doing for you, which is basically turning the XML into a Hash. Once you have the Hash, just prune it as you wish, based on whatever rules you have, before you insert it into Mongo.
Based on your comment, I better understand what you are asking. My answer still holds true: just manipulate the hash. Specifically, you could do this:
myXML.merge!(myXML.delete("ShipmentRequest"))
myXML.delete("Envelope")
myXML.merge!(myXML.delete("Message"))
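The pruning step can be sketched in plain Ruby as below. This is only an illustration, not part of crack or the Mongo driver: `prune_empty`, `blank_value?` and `flatten_wrappers` are hypothetical helper names, the sample hash is a trimmed copy of the parsed XML, and treating empty strings, hashes and arrays as "empty" alongside nil is an assumption about what should be dropped.

```ruby
# Treat nil and anything that reports itself as empty ("", {}, []) as blank.
# Assumption: this is what "empty" should mean for the import.
def blank_value?(value)
  value.nil? || (value.respond_to?(:empty?) && value.empty?)
end

# Recursively drop blank values from nested hashes and arrays.
def prune_empty(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), result|
      pruned = prune_empty(child)
      result[key] = pruned unless blank_value?(pruned)
    end
  when Array
    value.map { |child| prune_empty(child) }.reject { |child| blank_value?(child) }
  else
    value
  end
end

# Lift the children of each wrapper key to the top level, then drop unwanted keys.
# Works on a shallow copy, so the input hash is left untouched.
def flatten_wrappers(hash, wrappers, unwanted)
  result = hash.dup
  wrappers.each do |key|
    inner = result.delete(key)
    result.merge!(inner) if inner.is_a?(Hash)
  end
  unwanted.each { |key| result.delete(key) }
  result
end

# Trimmed-down version of the hash that Crack::XML.parse returns for sample.xml.
SAMPLE = {
  "ShipmentRequest" => {
    "Envelope" => { "TransmissionDateTime" => "05/08/2013 23:06:02" },
    "Message" => {
      "Comment" => nil,
      "Header" => {
        "MemberId" => "A00000001",
        "Payments" => [{ "PayType" => "Points", "Amount" => "5390" }]
      },
      "Line" => [{ "LineNumber" => "3.1", "Item" => "fruit-004" }]
    }
  }
}

flat = flatten_wrappers(SAMPLE, %w[ShipmentRequest Message Header], %w[Envelope])
document = prune_empty(flat)
puts document.keys.inspect
```

The resulting `document` would then be what gets passed to `coll.insert`. Keeping the flattening and the pruning as separate steps makes it easy to change either rule without touching the other.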
I am using Elasticsearch to index data and want to export a few fields from an index that is created every day to Google Cloud Storage. How can I get fields from an array of objects in an Elasticsearch index and send them as a CSV file to a GCS bucket using Logstash?
I tried the conf below to fetch nested fields from the index:
input {
elasticsearch {
hosts => "host:443"
user => "user"
ssl => true
connect_timeout_seconds => 600
request_timeout_seconds => 600
password => "pwd"
ca_file => "ca.crt"
index => "test"
query => '
{
"_source": ["obj1.Name","obj1.addr","obj1.obj2.location", "Hierarchy.categoryUrl"],
"query": {
"match_all": {}
}
}
'
}
}
filter {
mutate {
rename => {
"[obj1][Name]" => "col1"
"[obj1][addr]" => "col2"
"[obj1][obj2][location]" => "col3"
"[Hierarchy][0][categoryUrl]" => "col4"
}
}
}
output {
google_cloud_storage {
codec => csv {
include_headers => true
columns => [ "col1", "col2","col3"]
}
bucket => "bucket"
json_key_file => "creds.json"
temp_directory => "/tmp"
log_file_prefix => "log_gcs"
max_file_size_kbytes => 1024
date_pattern => "%Y-%m-%dT%H:00"
flush_interval_secs => 600
gzip => false
uploader_interval_secs => 600
include_uuid => true
include_hostname => true
}
}
How can I populate a CSV column from an array of objects? In the example below, I want to fetch categoryUrl from the first object of the array, write it to the CSV and send it to the GCS bucket.
I have tried the approaches below:
"_source": ["obj1.Name","obj1.addr","obj1.obj2.location", "Hierarchy.categoryUrl"]
and
"_source": ["obj1.Name","obj1.addr","obj1.obj2.location", "Hierarchy[0].categoryUrl"]
with
mutate {
rename => {
"[obj1][Name]" => "col1"
"[obj1][addr]" => "col2"
"[obj1][obj2][location]" => "col3"
"[Hierarchy][0][categoryUrl]" => "col4"
}
}
for this input sample:
"Hierarchy" : [
{
"level" : "1",
"category" : "test",
"categoryUrl" : "testurl1"
},
{
"level" : "2",
"category" : "test2",
"categoryUrl" : "testurl2"
}
]
Here is a sample document from which I am trying to fetch merchandisingHierarchy[0].categoryUrl and pricingInfo[0].basePrice:
{
"_index" : "amulya-test",
"_type" : "_doc",
"_id" : "ldZPJoYBFi8LOEDK_M2f",
"_score" : 1.0,
"_ignored" : [
"itemDetails.description.keyword"
],
"_source" : {
"itemDetails" : {
"compSku" : "202726",
"compName" : "abc.com",
"compWebsite" : "abc.com",
"title" : "Monteray 38.25 in. x 73.375 in. Frameless Hinged Corner Shower Enclosure in Brushed Nickel",
"description" : "Create the modthroom of your dreams with the clean lines of the VIGO Monteray Frameless Shower Enclosure. Solid 3/8 in. tempered glass combined with stainless steel and solid brass construction makes this enclosure strong and long-lasting. The sleek, reversible, outward-opening door features a convenient towel bar. This versatile enclosure can be installed on a tile floor or with a VIGO Shower Base. With a single water deflector along the bottom seal strip, water is redirected back into the shower to keep your bathroom dry, clean, and pristine.",
"modelNumber" : "VG6011BNCL40",
"upc" : "8137756684",
"hasVariations" : false,
"productDetailsBulletPoints" : [ ],
"itemUrls" : {
"productPageUrl" : "https://.abc.com/p/VIGO-Monteray-38-in-x-73-375-in-Frameless-Hinged-Corner-Shower-Enclosure-in-Brushed-Nickel-VG6011BNCL40/202722616",
"primaryImageUrl" : "https://images.thdstatic.com/productImages/d77d9e8b-1ea1-4811-a470-8364c8e47402/svn/vigo-shower-enclosures-vg6011bncl40-64_600.jpg",
"secondaryImageUrls" : [
"https://images.thdstatic.com/productImages/d77d9e8b-1e1-4811-a470-8364c8e47402/svn/vigo-shower-enclosures-vg6011bncl40-64_1000.jpg",
"https://images.thdstatic.com/productImages/db539ff9-6df-48c2-897a-18dd1e1794e3/svn/vigo-shower-enclosures-vg6011bncl40-e1_1000.jpg",
"https://images.thdstatic.com/productImages/47c5090b-49a-46bc-a36d-921ddae5e1ab/svn/vigo-shower-enclosures-vg6011bncl40-40_1000.jpg",
"https://images.thdstatic.com/productImages/add6691c-a02-466d-9a1a-47200b05685e/svn/vigo-shower-enclosures-vg6011bncl40-a0_1000.jpg",
"https://images.thdstatic.com/productImages/d638230e-0d9-40c9-be93-7f7bf24f0732/svn/vigo-shower-enclosures-vg6011bncl40-1d_1000.jpg"
]
}
},
"merchandisingHierarchy" : [
{
"level" : "1",
"category" : "Home",
"categoryUrl" : "host/"
},
{
"level" : "2",
"category" : "Bath",
"categoryUrl" : "host/b/Bath/N-5yc1vZbzb3"
},
{
"level" : "3",
"category" : "Showers",
"categoryUrl" : "host/b/Bath-Showers/N-5yc1vZbzcd"
},
{
"level" : "4",
"category" : "Shower Doors",
"categoryUrl" : "host/b/Bath-Showers-Shower-Doors/N-5yc1vZbzcg"
},
{
"level" : "5",
"category" : "Shower Enclosures",
"categoryUrl" : "host/b/Bath-Showers-Shower-Doors-Shower-Enclosures/N-5yc1vZcbn2"
}
],
"reviewsAndRatings" : {
"pdtReviewCount" : 105
},
"additionalAttributes" : {
"isAddon" : false
},
"productSpecifications" : {
"Warranties" : { },
"Details" : { },
"Dimensions" : { }
},
"promoDetails" : [
{
"promoName" : "Save $150.00 (15%)",
"promoPrice" : 849.9
}
],
"locationDetails" : { },
"storePickupDetails" : {
"deliveryText" : "Get it by Mon, Feb 20",
"toEddDate" : "Mon, Feb 20",
"isBackordered" : false,
"selectedEddZipcode" : "20147",
"shipToStoreEnabled" : true,
"homeDeliveryEnabled" : true,
"scheduledDeliveryEnabled" : false
},
"recommendedProducts" : [ ],
"pricingInfo" : [
{
"type" : "SAS",
"offerPrice" : 849.9,
"sellerName" : "abc.com",
"onClearance" : false,
"inStock" : true,
"isBuyBoxWinner" : true,
"promo" : [
{
"onPromo" : true,
"promoName" : "Save $150.00 (15%)",
"promoPrice" : 849.9
}
],
"basePrice" : 999.9,
"priceVariants" : [
{
"basePrice" : 999.9,
"offerPrice" : 849.9
}
],
"inventoryDetails" : {
"stockInStore" : false,
"stockOnline" : true
}
}
]
}
}
You can do it like this:
input {
elasticsearch {
...
query => '
{
"_source": ["merchandisingHierarchy.categoryUrl", "pricingInfo.basePrice"],
"query": {
"match_all": {}
}
}
'
}
}
filter {
mutate {
add_field => {
"col1" => "%{[merchandisingHierarchy][0][categoryUrl]}"
"col2" => "%{[pricingInfo][0][basePrice]}"
}
}
}
output {
stdout {
codec => csv {
include_headers => true
columns => [ "col1", "col2"]
}
}
}
I've tested with your sample document and I get the output below, which looks like it matches your expectation:
col1,col2
host/,999.9
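To make the field reference easier to follow, here is a plain-Ruby illustration of how a reference such as `[merchandisingHierarchy][0][categoryUrl]` walks the document: key, then array index, then key. This is not Logstash code; `resolve_field` is a hypothetical helper, the sample hash is a trimmed copy of the document above, and treating every all-digit segment as an array index is a simplifying assumption.

```ruby
# Resolve a field reference written as "[a][0][b]" against a plain Ruby hash.
# Assumption: segments made only of digits are array indexes, all others are keys.
def resolve_field(document, reference)
  path = reference.scan(/\[([^\]]+)\]/).flatten.map do |segment|
    segment.match?(/\A\d+\z/) ? segment.to_i : segment
  end
  document.dig(*path)
end

# Trimmed-down copy of the _source of the sample document.
SOURCE = {
  "merchandisingHierarchy" => [
    { "level" => "1", "category" => "Home", "categoryUrl" => "host/" },
    { "level" => "2", "category" => "Bath", "categoryUrl" => "host/b/Bath/N-5yc1vZbzb3" }
  ],
  "pricingInfo" => [{ "type" => "SAS", "basePrice" => 999.9 }]
}

# Build one CSV-like row from the first element of each array.
row = [
  resolve_field(SOURCE, "[merchandisingHierarchy][0][categoryUrl]"),
  resolve_field(SOURCE, "[pricingInfo][0][basePrice]")
]
puts row.join(",")
```

Changing the index in the reference selects a different element of the array, which is why the `[0]` segment is needed to pick a single value for the column.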
I'm trying to fetch data from indexes with a Hibernate Search full-text query.
Below is the index structure:
{
"_index" : "basclt1400",
"_type" : "com.csc.pt.svc.data.to.Basclt1400TO",
"_id" : "00,0006682,CPP,05,00",
"_score" : 1.0,
"_source" : {
"id" : "00,0006682,CPP,05,00",
"location" : "00",
"master0co" : "05",
"policy0num" : "0006682",
"symbol" : "CPP",
"module" : "00",
"cltseqnum" : 281,
"addrseqnum" : "1",
"policies_location" : [
"00",
"00"
],
"policies_master0co" : [
"05",
"05"
],
"policies_policy0num" : [
"0006682",
"0006682"
],
"policies_trans0stat" : [
"V",
"P"
],
"policies_id02" : [
"02",
"02"
],
"policies_symbol" : [
"CPP",
"CPP"
],
"policies_module" : [
"00",
"00"
],
"policies_tot0ag0prm" : [
"1532.00",
"1532.00"
],
"policies_issue0code" : [
"N",
"N"
],
"policies_id" : [
"02,00,0006682,CPP,05,00,V",
"02,00,0006682,CPP,05,00,P"
]
}
}
This structure can vary with the data in the index. In some documents the "policies_policy0num" field holds just one value, as below, and the query works fine with that structure:
{
"_index" : "basclt1400",
"_type" : "com.csc.pt.svc.data.to.Basclt1400TO",
"_id" : "00,0012410,CPP,05,00",
"_score" : 1.0,
"_source" : {
"id" : "00,0012410,CPP,05,00",
"location" : "00",
"master0co" : "05",
"policy0num" : "0012410",
"symbol" : "CPP",
"module" : "00",
"cltseqnum" : 281,
"addrseqnum" : "1",
"policies_location" : [
"00"
],
"policies_master0co" : [
"05"
],
"policies_policy0num" : [
"0012410"
],
"policies_trans0stat" : [
"P"
],
"policies_id02" : [
"02"
],
"policies_symbol" : [
"CPP"
],
"policies_module" : [
"00"
],
"policies_tot0ag0prm" : [
"0.00"
],
"policies_issue0code" : [
"N"
],
"policies_id" : [
"02,00,0012410,CPP,05,00,P"
]
}
}
I'm trying to fetch this like below:
Iterator itr = fullTextQuery.getResultList().iterator();
List<MasterSearchPmsp0200DataArr> policyArrayFinal = new ArrayList<MasterSearchPmsp0200DataArr>();
List<MasterSearchPmsp0200DataArr> quoteArrayFinal = new ArrayList<MasterSearchPmsp0200DataArr>();
while(itr.hasNext()){
Object[] obj = (Object[]) itr.next();
char issueCode = (char) obj[5];
if(issueCode == 'N' || issueCode == 'R') {
policyArrayFinal.add( new MasterSearchPmsp0200DataArr((String) obj[0], Long.valueOf(to.getCltseqnum()),
(String) obj[1], (String) obj[2], (String) obj[3], (String) obj[4],
(char) obj[5], (char) obj[6]));
}else {
quoteArrayFinal.add( new MasterSearchPmsp0200DataArr((String) obj[0], Long.valueOf(to.getCltseqnum()),
(String) obj[1], (String) obj[2], (String) obj[3], (String) obj[4],
(char) obj[5], (char) obj[6]));
}
}
and it throws the error below, but only for the records that have multiple values under policies_policy0num.
java.lang.IllegalStateException
at com.google.gson.JsonArray.getAsString(JsonArray.java:226)
at org.hibernate.search.elasticsearch.query.impl.PrimitiveProjection.addDocumentField(PrimitiveProjection.java:69)
at org.hibernate.search.elasticsearch.query.impl.PrimitiveProjection.addDocumentField(PrimitiveProjection.java:43)
at org.hibernate.search.elasticsearch.query.impl.TwoWayFieldBridgeProjection.convertFieldValue(TwoWayFieldBridgeProjection.java:60)
at org.hibernate.search.elasticsearch.query.impl.TwoWayFieldBridgeProjection.convertHit(TwoWayFieldBridgeProjection.java:43)
at org.hibernate.search.elasticsearch.query.impl.QueryHitConverter.convert(QueryHitConverter.java:186)
at org.hibernate.search.elasticsearch.query.impl.IndexSearcher.convertQueryHit(IndexSearcher.java:138)
at org.hibernate.search.elasticsearch.query.impl.ElasticsearchHSQueryImpl.queryEntityInfos(ElasticsearchHSQueryImpl.java:233)
at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.doHibernateSearchList(FullTextQueryImpl.java:238)
at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.list(FullTextQueryImpl.java:223)
at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.getResultList(FullTextQueryImpl.java:122)
How should I handle this scenario in my Hibernate Search Java code?
Adding the query code:
Query query = queryBuilder.keyword().onField("cltseqnum").matching(to.getCltseqnum()).createQuery();
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(query, Basclt1400TO.class);
fullTextQuery.setProjection( "policies_policy0num", "policies_symbol",
"policies_module", "policies_master0co","policies_location", "policies_issue0code",
"policies_trans0stat");
You didn't provide the code used to build the query, but from what I can see you are using projections.
Projections do not support multi-valued fields, so you simply cannot make this work, unless you project on the whole document (using org.hibernate.search.elasticsearch.ElasticsearchProjectionConstants.SOURCE) and parse it yourself, which would be a terrible hack.
I would recommend using "traditional" entity loading (without projections) and getting the data from your entities. Unless you've got tremendous performance constraints, this should result in decent performance, especially if you tune your Hibernate ORM mapping correctly.
Please, observe:
MongoDB shell version: 2.4.1
connecting to: test
> use dummy
switched to db dummy
> db.invoices.find({'items.nameTags': /^z/}, {_id: 1}).explain()
{
"cursor" : "BtreeCursor items.nameTags_1_created_1_special_1__id_1_items.qty_1_items.total_1 multi",
"isMultiKey" : true,
"n" : 55849,
"nscannedObjects" : 223568,
"nscanned" : 223568,
"nscannedObjectsAllPlans" : 223568,
"nscannedAllPlans" : 223568,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 86,
"nChunkSkips" : 0,
"millis" : 88864,
"indexBounds" : {
"items.nameTags" : [
[
"z",
"{"
],
[
/^z/,
/^z/
]
],
"created" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
],
"special" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
],
"_id" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
],
"items.qty" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
],
"items.total" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
"server" : "IL-Mark-LT:27017"
}
>
Here is the definition of the index:
> db.system.indexes.find({name : 'items.nameTags_1_created_1_special_1__id_1_items.qty_1_items.total_1'}).pretty()
{
"v" : 1,
"key" : {
"items.nameTags" : 1,
"created" : 1,
"special" : 1,
"_id" : 1,
"items.qty" : 1,
"items.total" : 1
},
"ns" : "dummy.invoices",
"name" : "items.nameTags_1_created_1_special_1__id_1_items.qty_1_items.total_1"
}
>
Finally, here is an example invoice document (with just 2 items):
> db.invoices.findOne({itemCount: 2})
{
"_id" : "85923",
"customer" : "Wgtd Fm 91",
"businessNo" : "314227928",
"billTo_name" : "Wgtd Fm 91",
"billTo_addressLine1" : "3839 Ross Street",
"billTo_addressLine2" : "Kingston, ON",
"billTo_postalCode" : "K7L 4V4",
"purchaseOrderNo" : "boi",
"terms" : "COD",
"shipDate" : "2013-07-10",
"shipVia" : "Moses Transportation Inc.",
"rep" : "Snowhite",
"items" : [
{
"qty" : 4,
"name" : "CA 7789",
"desc" : "3 pc. Coffee Table set (Silver)",
"price" : 222.3,
"total" : 889.2,
"nameTags" : [
"ca 7789",
"a 7789",
" 7789",
"7789",
"789",
"89",
"9"
],
"descTags" : [
"3",
"pc",
"c",
"coffee",
"offee",
"ffee",
"fee",
"ee",
"e",
"table",
"able",
"ble",
"le",
"e",
"set",
"et",
"t",
"silver",
"ilver",
"lver",
"ver",
"er",
"r"
]
},
{
"qty" : 4,
"name" : "QP 8681",
"desc" : "Ottoman Bed",
"price" : 1179.1,
"total" : 4716.4,
"nameTags" : [
"qp 8681",
"p 8681",
" 8681",
"8681",
"681",
"81",
"1"
],
"descTags" : [
"ottoman",
"ttoman",
"toman",
"oman",
"man",
"an",
"n",
"bed",
"ed",
"d"
]
}
],
"itemCount" : 2,
"discount" : "10%",
"delivery" : 250,
"hstPercents" : 13,
"subTotal" : 5605.6,
"totalBeforeHST" : 5295.04,
"total" : 5983.4,
"totalDiscount" : 560.56,
"hst" : 688.36,
"modified" : "2012-10-08",
"created" : "2014-06-25",
"version" : 0
}
>
My problem is that, according to the explain() output above, MongoDB does not answer the query from the index alone ("indexOnly" is false). Why? After all, I only request the _id field, which is part of the index.
In general, I feel that I am doing something very wrong. My invoices collection has 65,000 invoices with a total of 3,291,092 items. It took almost 89 seconds to explain() the query.
What am I doing wrong?
You are using arrays and subdocuments. Covered indexes don't work with either of these.
From the mongo docs:
An index cannot cover a query if:
any of the indexed fields in any of the documents in the collection includes an array. If an indexed field is an array, the index becomes a multi-key index and cannot support a covered query.
any of the indexed fields are fields in subdocuments. To index fields in subdocuments, use dot notation. For example, consider a collection users with documents of the following form:
http://docs.mongodb.org/manual/tutorial/create-indexes-to-support-queries/
{
CONTENT1:{
YDXM:[{
"name":"1",
"MBNH":"1"},
{"name":"2",
"MBNH":"2"}]
}
}
I want to delete the {"name":"1","MBNH":"1"}. How can I achieve this?
Assuming that the following is your document and you want to delete the ENTIRE document:
{
"CONTENT1": {
"YDXM": [
{
"name": "1",
"MBNH": "1"
},
{
"name": "2",
"MBNH": "2"
}
]
}
}
You could use this:
db.test.remove({"CONTENT1.YDXM.name" : "1", "CONTENT1.YDXM.MBNH" : "1"})
Now, if you only want to remove the sub-document {"name" : "1", "MBNH" : "1"} from the CONTENT1.YDXM array, you should use the $pull operator:
db.test.update({"CONTENT1.YDXM.name" : "1", "CONTENT1.YDXM.MBNH" : "1"}, { $pull : { "CONTENT1.YDXM" : {"name" : "1", "MBNH" : "1"} } }, false, true)
This performs an update on all documents that match the first argument. The second argument, with the $pull operator, tells MongoDB to remove the value {"name" : "1", "MBNH" : "1"} from the CONTENT1.YDXM array.
You can read more about the $pull operator and the update command at these links:
http://docs.mongodb.org/manual/reference/operator/pull/
http://docs.mongodb.org/manual/applications/update/
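As a rough illustration of what the $pull update above does to the array, here is a plain-Ruby sketch that operates on an in-memory hash rather than on MongoDB. `pull_matching` is a hypothetical helper, and the matching rule (every field in the condition must be equal) is a simplification; the MongoDB documentation is the reference for the exact $pull semantics.

```ruby
# Return a copy of the array without the elements whose fields match every
# key/value pair in the condition. Simplified stand-in for $pull with a
# sub-document; not the MongoDB implementation.
def pull_matching(array, condition)
  array.reject do |element|
    element.is_a?(Hash) && condition.all? { |key, value| element[key] == value }
  end
end

# In-memory copy of the document from the question.
DOCUMENT = {
  "CONTENT1" => {
    "YDXM" => [
      { "name" => "1", "MBNH" => "1" },
      { "name" => "2", "MBNH" => "2" }
    ]
  }
}

remaining = pull_matching(DOCUMENT["CONTENT1"]["YDXM"], { "name" => "1", "MBNH" => "1" })
puts remaining.inspect
```

Only the matching element is dropped; the rest of the array and the enclosing document stay in place, which is the difference from the remove call shown earlier.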
I'm working on retrieving all the orders for a given shop, using code that looks like this:
orders = ShopifyAPI::Order.find(:all, :params => {:financial_status => 'paid'})
orders.each do |order|
order_json = order.to_json
post_json_to_server(order_json)
end
But for some reason, when I inspect the JSON created by order.to_json, the discount_codes and client_details attributes look like this:
"client_details":{"":{"accept_language":"en-US,en;q=0.8","browser_ip":"199.185.98.174","session_hash":"6b37d22ebcdf097f5ab4e1c9e596a504c4cdc4c41c4f2b29a3a7aae4ead559c3","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.47 Safari/536.11"}}
"discount_codes":[{"":{"amount":"20.00","code":"CZV57KSE8VMV"}}]
Why is there a leading {"": in both of these lists? Is something misconfigured in my test store?
Here is a complete dump (captured with RequestBin) of the JSON I'm sending:
{ "billing_address" : { "address1" : "asdf",
"address2" : "",
"city" : "asdf",
"company" : "",
"country" : "United States",
"country_code" : "US",
"first_name" : "asdf",
"last_name" : "asdf",
"latitude" : "45.176384",
"longitude" : "-123.045601",
"name" : "asdf asdf",
"phone" : "",
"province" : "Alaska",
"province_code" : "AK",
"zip" : "asdf"
},
"browser_ip" : "199.185.98.174",
"buyer_accepts_marketing" : true,
"cancel_reason" : null,
"cancelled_at" : null,
"cart_token" : "bbf42c99f456f9ccda30554022fec659",
"client_details" : { "" : { "accept_language" : "en-US,en;q=0.8",
"browser_ip" : "199.185.98.174",
"session_hash" : "6b37d22ebcdf097f5ab4e1c9e596a504c4cdc4c41c4f2b29a3a7aae4ead559c3",
"user_agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.47 Safari/536.11"
} },
"closed_at" : null,
"created_at" : "2012-07-10T17:23:53-04:00",
"currency" : "CAD",
"customer" : { "accepts_marketing" : true,
"created_at" : "2012-07-10T17:23:53-04:00",
"email" : "asdf@example.com",
"first_name" : "asdf",
"id" : 94366870,
"last_name" : "asdf",
"last_order_id" : null,
"last_order_name" : null,
"note" : null,
"orders_count" : 0,
"state" : "disabled",
"tags" : "",
"total_spent" : "0.00",
"updated_at" : "2012-07-10T17:35:23-04:00"
},
"discount_codes" : [ { "" : { "amount" : "20.00",
"code" : "CZV57KSE8VMV"
} } ],
"email" : "asdf@example.com",
"financial_status" : "authorized",
"fulfillment_status" : null,
"fulfillments" : [ ],
"gateway" : "bogus",
"id" : 134753494,
"landing_site" : "/",
"landing_site_ref" : null,
"line_items" : [ { "fulfillment_service" : "manual",
"fulfillment_status" : null,
"grams" : 0,
"id" : 219421970,
"name" : "Grass-roots methodical instruction set",
"price" : "19.00",
"product_id" : 95843140,
"quantity" : 2,
"requires_shipping" : true,
"sku" : "",
"title" : "Grass-roots methodical instruction set",
"variant_id" : 224399478,
"variant_inventory_management" : null,
"variant_title" : null,
"vendor" : "Shopify"
} ],
"name" : "#1004",
"note" : "",
"note_attributes" : [ ],
"number" : 4,
"order_number" : 1004,
"payment_details" : { "avs_result_code" : null,
"credit_card_bin" : "1",
"credit_card_company" : "Bogus",
"credit_card_number" : "XXXX-XXXX-XXXX-1",
"cvv_result_code" : null
},
"processing_method" : "direct",
"referring_site" : "",
"shipping_address" : { "address1" : "asdf",
"address2" : "",
"city" : "asdf",
"company" : "",
"country" : "United States",
"country_code" : "US",
"first_name" : "asdf",
"last_name" : "asdf",
"latitude" : "45.176384",
"longitude" : "-123.045601",
"name" : "asdf asdf",
"phone" : "",
"province" : "Alaska",
"province_code" : "AK",
"zip" : "asdf"
},
"shipping_lines" : [ { "code" : "International Shipping",
"price" : "20.00",
"source" : "shopify",
"title" : "International Shipping"
} ],
"subtotal_price" : "18.00",
"tax_lines" : [ ],
"taxes_included" : false,
"token" : "51116b93d2d774b6c537a3bcc8861506",
"total_discounts" : "20.00",
"total_line_items_price" : "38.00",
"total_price" : "38.00",
"total_price_usd" : "37.28",
"total_tax" : "0.00",
"total_weight" : 0,
"updated_at" : "2012-07-10T17:35:20-04:00"
}
In an attempt to get the Order formatted the same way as the JSON sent by a webhook, I ended up doing it myself - here is the code:
ActiveResource::Base.include_root_in_json = true
orders = ShopifyAPI::Order.find(:all, :params => {:financial_status => 'paid'})
orders.each do |order|
order_json = order.as_json
%w(billing_address customer line_items payment_details shipping_address shipping_lines).each do |attribute|
order_json[attribute] = order.send(attribute).as_json
end
if order_json['discount_codes'].length > 0
order_json['discount_codes'] = [order_json['discount_codes'].as_json[0][nil]]
end
order_json['client_details'] = order_json['client_details'].as_json[nil]
post_json_to_server(order_json)
end
By doing it this way, I was able to turn the JSON into something that mapped exactly to what Shopify's order paid webhook sends.
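The unwrapping step can be illustrated without the Shopify API. The sketch below assumes, as the `[nil]` lookups in the code above suggest, that the wrapper is a hash keyed by nil, which Ruby's json library serializes with an empty-string key. `unwrap_nil_root` is a hypothetical helper name and the sample values are copied from the question.

```ruby
require "json"

# Strip a single nil-keyed wrapper if present; otherwise return the value as-is.
def unwrap_nil_root(value)
  if value.is_a?(Hash) && value.keys == [nil]
    value[nil]
  else
    value
  end
end

# Same shape as the discount_codes attribute in the question.
WRAPPED_CODES = [{ nil => { "amount" => "20.00", "code" => "CZV57KSE8VMV" } }]

# The nil key becomes "" when the hash is serialized.
puts JSON.generate(WRAPPED_CODES)

# After unwrapping, each element is the plain attribute hash.
clean_codes = WRAPPED_CODES.map { |entry| unwrap_nil_root(entry) }
puts JSON.generate(clean_codes)
```

Because the helper leaves values without a nil root untouched, it can be applied to every attribute without checking first whether the wrapper is there.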
When you call to_json without telling ActiveResource (which ShopifyAPI resources are built on) how to treat the root, it can do that.
You can tell ActiveResource to include the root when rendering JSON.
ActiveResource::Base.include_root_in_json = true
You would then see
{"order":{...}} and not {"":{...}}
Often you can just use the syntax
order_json = order.to_json(:root => true)
in order to get the key (which is the root) you want. Using JSON is still kinda like walking inside a carnival jumpy ride for kids...