Store Taxonomy (Tree of Life) with Couchbase, best practice - data-structures

I previously asked a question about this (How to store a Tree Of Life in MySQL? (Phylum, Class, Order, Family, etc)) while I was using MySQL, but after someone suggested MongoDB I decided to look around and ended up installing Couchbase. (Funny thing: geraldss suggested Couchbase in a comment on my previous question after I had already installed it.)
I will store the Kingdom, Phylum, Class, etc. for each fish in a document. But I would also like to have one or more documents to drive a multilevel dropdown.
I want the first dropdown to list the 5 Kingdoms; the next dropdown will then be populated with the Phyla available for the selected Kingdom, the third with that Phylum's Classes, and so on.
Right now I'm trying to find the most practical way to store this.
Should I create one single huge document called taxonomy, or should I create multiple documents, let's say one per Phylum (still huge documents), or something else?
A small PHP array example to help me create the document(s):
$taxonomy = array(
    "Fungi" => array(
        "type" => "kingdom",
        "origin" => "Latin, derived from Greek - sp(h)onges, sponges",
        "description" => "Obtain food through absorption, excrete enzymes for digestion",
        "example" => "molds, mushrooms, lichens",
        "Phylum" => array(),
    ),
    "Plantae" => array(
        "type" => "kingdom",
        "origin" => "Latin - plant",
        "description" => "Multicellular organisms that are autotrophic",
        "example" => "mosses, ferns, grasses, flowers, trees.",
        "Phylum" => array(),
    ),
    "Animalia" => array(
        "type" => "kingdom",
        "origin" => "Latin - breath, soul",
        "description" => "Multicellular organisms that develop from the fertilization of an egg by a sperm",
        "example" => "sponges, worms, insects, fish, birds, humans",
        "Phylum" => array( // 32 total, 12 here
            "Porifera" => array(
                "name" => "Porifera",
                "type" => "Phylum",
                "origin" => "Latin - to bear pores",
                "description" => "Sponges",
            ),
            "Cnidaria" => array(
                "name" => "Cnidaria",
                "type" => "Phylum",
                "origin" => "Greek - nettle",
                "description" => "",
                "Class" => array(
                    "Hydrozoa" => array(
                        "name" => "Hydrozoa",
                        "description" => "Hydras",
                        "Order" => array( // multiple Orders
                            "Family" => array( // multiple Families for each Order
                                "Genus" => array( // multiple Genera for each Family
                                ),
                            ),
                        ),
                    ),
                ),
            ),
        ),
    ),
);
What should I do? Would it be a good approach to create one single document? How would you store this?
(Please don't close this question, I'm trying very hard to switch my mind from relational to document-based DB and I'm struggling with this)
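As a point of comparison with the single nested array above, here is a minimal sketch of one alternative layout: one small document per taxon, each holding a reference to its parent. The key format "rank::name", the field name parent, and the helper functions are all assumptions made for this illustration, and a plain PHP array stands in for the bucket so that no SDK calls are involved.

```php
<?php
// Hypothetical layout: one small document per taxon, keyed "rank::name",
// each pointing at its parent. $docs stands in for the bucket contents.
function taxonKey($rank, $name)
{
    return strtolower($rank) . '::' . $name;
}

function taxonDocument($rank, $name, $parentKey = null, array $extra = array())
{
    // Core fields are merged last so $extra cannot overwrite them.
    return array_merge($extra, array(
        'type'   => strtolower($rank),
        'name'   => $name,
        'parent' => $parentKey,
    ));
}

// Returns the sorted names of all taxa whose parent is $parentKey;
// pass null to get the top level (the kingdoms).
function childrenOf(array $docs, $parentKey)
{
    $names = array();
    foreach ($docs as $doc) {
        if ($doc['parent'] === $parentKey) {
            $names[] = $doc['name'];
        }
    }
    sort($names);
    return $names;
}

$docs = array();
$rows = array(
    array('kingdom', 'Fungi', null),
    array('kingdom', 'Plantae', null),
    array('kingdom', 'Animalia', null),
    array('phylum', 'Porifera', 'kingdom::Animalia'),
    array('phylum', 'Cnidaria', 'kingdom::Animalia'),
    array('class', 'Hydrozoa', 'phylum::Cnidaria'),
);
foreach ($rows as $row) {
    list($rank, $name, $parent) = $row;
    $docs[taxonKey($rank, $name)] = taxonDocument($rank, $name, $parent);
}

$kingdoms = childrenOf($docs, null);                            // first dropdown
$phyla    = childrenOf($docs, taxonKey('kingdom', 'Animalia')); // second dropdown
$classes  = childrenOf($docs, taxonKey('phylum', 'Cnidaria'));  // third dropdown
```

With this shape each dropdown needs only the children of one parent, so no request has to load the whole tree. In a real bucket the childrenOf lookup would be served by some index on the parent field rather than a loop; how that is done is left open here.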

Related

Updating a member's group interests using MailChimp API V3

I am running the following code (I've hidden the IDs) to add/update a subscriber's interest groups in a MailChimp list:
$mailchimp->patch('lists/1234567/members/' . md5('test#test.com'), [
    'status' => 'subscribed',
    'merge_fields' => array(
        'FNAME' => 'Ben',
        'LNAME' => 'Sinclair',
    ),
    'interests' => array(
        'abcd1234' => true,
        'defg6789' => true,
    ),
]);
The interests key is what I'm having issues with.
I presumed that whatever I put in this key would overwrite what currently exists.
That doesn't seem to be the case: it only adds new interests and does not remove any whose IDs are missing from the array. I am not getting any errors.
Does anyone know how to overwrite interest groups? Or, if that's not possible, is there a way to remove interest groups?
For completeness, I wanted to add this answer so that people stumbling upon this post can find a quick solution.
$mailchimp->patch('lists/1234567/members/' . md5('test#test.com'), [
    'status' => 'subscribed',
    'merge_fields' => array(
        'FNAME' => 'Ben',
        'LNAME' => 'Sinclair',
    ),
    'interests' => array(
        'abcd1234' => true,  // Attached
        'defg6789' => false, // Detached
    ),
]);
In this example the interest 'abcd1234' will be attached and the interest 'defg6789' will be detached.
Interests that are not listed keep their original value.
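Since unlisted interests keep their value, getting "overwrite" behaviour means listing every interest explicitly. The sketch below builds such a map in plain PHP; the helper name buildInterestsPayload and the ID 'hijk0000' are made up for this example, and it assumes the full list of interest IDs for the list is already known to the caller.

```php
<?php
// Builds a complete "interests" map: every known interest ID is listed,
// true if it should be attached and false if it should be detached.
function buildInterestsPayload(array $allInterestIds, array $wantedIds)
{
    $payload = array();
    foreach ($allInterestIds as $id) {
        $payload[$id] = in_array($id, $wantedIds, true);
    }
    return $payload;
}

// Placeholder IDs in the style of the example above; 'hijk0000' is invented.
$interests = buildInterestsPayload(
    array('abcd1234', 'defg6789', 'hijk0000'),
    array('abcd1234')
);
```

The resulting array can be passed as the 'interests' value of the patch call shown in the answer.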

CakePHP pagination in existing find() operations

I thought about pagination a bit too late in my CakePHP project, so now I have existing (complex) HABTM find operations (with dynamic ordering, tag search, etc.). Is it possible to add CakePHP pagination at this point? Or is it better/easier to do the pagination on my own (find the next 20 entries from ID xx...)?
I've searched for a solution for a long time but haven't found anything useful.
No matter how complex your app is, you only need a few syntax changes to add pagination. I had the same problem; please check the solution I implemented.
$conditionSearchLessonsByCourse = array(
    'CourseLessonsReference.is_active' => 1,
    'Course1.is_active' => 1,
    'CourseCategory.is_active' => 1,
    'Reference.is_active' => 1,
);
// Pagination logic
$this->paginate = array(
    'conditions' => $conditionSearchLessonsByCourse,
    'order' => 'CourseLessonsReference.id DESC',
    'limit' => PAGINATION_LIMIT,
    'joins' => array(
        array( // UserCourse = Course join
            'table' => 'courses',
            'alias' => 'Course1',
            'type' => 'INNER',
            'conditions' => array(
                'Course1.id = CourseLessonsReference.course_id',
            ),
        ),
        array( // Category = Course join
            'table' => 'course_categories',
            'alias' => 'CourseCategory',
            'type' => 'INNER',
            'conditions' => array(
                'CourseCategory.id = Course1.course_category_id',
            ),
        ),
    ),
    'recursive' => 2,
);
$allLessonReferences = $this->paginate('CourseLessonsReference');
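The "few syntax changes" can be sketched as a small conversion step: take the options array already passed to an existing find() and reuse it as paginate settings. The helper toPaginateSettings is hypothetical and written in plain PHP; the controller calls appear only as comments because they need a running CakePHP 2 application.

```php
<?php
// Hypothetical helper: reuse the options array of an existing find('all')
// call as paginate settings, adding only a page size.
function toPaginateSettings(array $findOptions, $limit = 20)
{
    // Paging values hard-coded for the old find() are dropped; 'limit'
    // becomes the page size and the page itself comes from the request.
    unset($findOptions['limit'], $findOptions['page'], $findOptions['offset']);
    return array_merge($findOptions, array('limit' => $limit));
}

$findOptions = array(
    'conditions' => array('CourseLessonsReference.is_active' => 1),
    'order'      => 'CourseLessonsReference.id DESC',
    'limit'      => 500,
);
$settings = toPaginateSettings($findOptions, 20);

// In a controller this would then be used as in the answer above:
// $this->paginate = $settings;
// $rows = $this->paginate('CourseLessonsReference');
```

Conditions, joins, and ordering are carried over untouched, which is why existing complex queries usually do not need to be rewritten.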

WordPress failing to sort and filter in same wp_query with meta values using relational operators

I am trying to implement a filter and sort system for WooCommerce products in WordPress.
There are two prices in my system (buy and rent).
Below is the argument array:
$args = array(
    'post_type' => 'product',
    'orderby' => 'meta_value_num',
    // This is to order by price
    'meta_key' => '_regular_price',
    'order' => 'ASC',
    'meta_query' => array(
        'relation' => 'AND',
        array(
            // This keeps only price values greater than 0 in the ordering
            'key' => '_regular_price',
            'value' => 0,
            'compare' => '>',
        ),
        array(
            'relation' => 'OR',
            array(
                'key' => array('_regular_price', 'rent_price'),
                'value' => array(1, 1000),
                'compare' => 'BETWEEN',
                'type' => 'DECIMAL',
            ),
            array(
                'key' => array('_regular_price', 'rent_price'),
                'value' => array(5000, 10000),
                'compare' => 'BETWEEN',
                'type' => 'DECIMAL',
            ),
        ),
    ),
);
If I remove the order-by clause it works OK, and if I remove the filter clause it also works fine. But both of them together won't work.
Here are the conditions I am looking to implement:
Order by buy price or rent price (No price range selected & Price > 0)
Filter products between price range (single price range or multiple)
Both of the above together
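One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: each meta_query clause normally takes a single string as its 'key', whereas the array above passes an array of two keys. The sketch below expands the same ranges into one clause per key and range. The helper buildPriceRangeGroup is hypothetical and plain PHP; only the resulting array would be handed to WP_Query.

```php
<?php
// Hypothetical helper: builds an OR group with one BETWEEN clause per
// (price range, meta key) pair, so that every 'key' is a plain string.
function buildPriceRangeGroup(array $metaKeys, array $ranges)
{
    $group = array('relation' => 'OR');
    foreach ($ranges as $range) {
        foreach ($metaKeys as $metaKey) {
            $group[] = array(
                'key'     => $metaKey,
                'value'   => $range,
                'compare' => 'BETWEEN',
                'type'    => 'DECIMAL',
            );
        }
    }
    return $group;
}

$args = array(
    'post_type'  => 'product',
    'orderby'    => 'meta_value_num',
    'meta_key'   => '_regular_price', // ordering key, as in the question
    'order'      => 'ASC',
    'meta_query' => array(
        'relation' => 'AND',
        array('key' => '_regular_price', 'value' => 0, 'compare' => '>', 'type' => 'DECIMAL'),
        buildPriceRangeGroup(
            array('_regular_price', 'rent_price'),
            array(array(1, 1000), array(5000, 10000))
        ),
    ),
);
```

Whether this resolves the combined sort and filter case depends on the installed WordPress version and on how the rent price is stored, so it should be treated as a starting point for debugging.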

How to sort on analyzed/tokenized field in Elasticsearch?

We're storing a title field in our index and want to use the field for two purposes:
We're analyzing with an ngram filter so we can provide autocomplete and instant results
We want to be able to list results using an ASC sort on the title field rather than score.
The index/filter/analyzer is defined like so:
array(
    'number_of_shards' => $this->shards,
    'number_of_replicas' => $this->replicas,
    'analysis' => array(
        'filter' => array(
            'nGram_filter' => array(
                'type' => 'nGram',
                'min_gram' => 2,
                'max_gram' => 20,
                'token_chars' => array('letter', 'digit', 'punctuation', 'symbol'),
            ),
        ),
        'analyzer' => array(
            'index_analyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'whitespace',
                'char_filter' => 'html_strip',
                'filter' => array('lowercase', 'asciifolding', 'nGram_filter'),
            ),
            'search_analyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'whitespace',
                'char_filter' => 'html_strip',
                'filter' => array('lowercase', 'asciifolding'),
            ),
        ),
    ),
),
The problem we're experiencing is unpredictable results when we sort on the title field. After a little searching, we found this at the end of the sort documentation page for Elasticsearch (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-sort.html#_memory_considerations):
For string based types, the field sorted on should not be analyzed / tokenized.
How can we both analyze the field and sort on it later? Do we need to store the field twice, with one copy using not_analyzed, in order to sort? Since the _source field also stores the title value in its original state, can that not be used to sort on?
You can use the built-in concept of the Multi Field Type in Elasticsearch.
The multi_field type allows to map several core_types of the same value. This can come very handy, for example, when wanting to map a string type, once when it’s analyzed and once when it’s not_analyzed.
In the Elasticsearch Reference, please look at the String Sorting and Multi Fields guide on how to set up what you need.
Please note that Multi Field mapping configuration has changed between Elasticsearch 0.90.X and 1.X. Use the appropriate following guide based on your version:
0.90 Multi Field Type
1.X Multi Field Type
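To make the idea concrete, here is a minimal sketch of what a 1.X-style mapping and a matching search body could look like as PHP arrays. The analyzer names come from the question; the sub-field name raw and both function names are assumptions chosen for this example, and the arrays would still have to be sent through whatever client the project uses.

```php
<?php
// Sketch of a 1.X-style mapping: the main field is analyzed with the
// analyzers from the question, and a sub-field keeps the exact value.
// The sub-field name "raw" is an assumption chosen for this example.
function titleMapping()
{
    return array(
        'title' => array(
            'type'            => 'string',
            'index_analyzer'  => 'index_analyzer',
            'search_analyzer' => 'search_analyzer',
            'fields'          => array(
                'raw' => array(
                    'type'  => 'string',
                    'index' => 'not_analyzed',
                ),
            ),
        ),
    );
}

// Request body that searches the analyzed field but sorts on the raw one.
function sortedTitleSearch($text)
{
    return array(
        'query' => array('match' => array('title' => $text)),
        'sort'  => array(array('title.raw' => array('order' => 'asc'))),
    );
}

$mapping = titleMapping();
$body    = sortedTitleSearch('elastic');
```

The analyzed title field keeps serving autocomplete, while title.raw holds a single untokenized term per document, which is what the sort documentation quoted in the question asks for.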

CakePHP 2.1 contain second level different field name

I can't get the values from a second-level contain that uses a different field name as the link.
Asistencia model:
var $belongsTo = array('Employee');
Horario model:
var $belongsTo = array('Employee');
Employee model:
var $hasMany = array(
'Horario' => array(
'foreignKey' => 'employee_id',
'dependent' => true
),
'Asistencia' => array(
'foreignKey' => 'employee_id',
'dependent' => true
)
);
I'll explain using these values on my example record:
Asistencia: employee_id = 3701
Employee : id = 3701
In my find() from Asistencia, I get to contain Employee by switching Employee primaryKey just fine:
$this->Asistencia->Employee->primaryKey = 'id';
$this->paginate = array(
    'contain' => array(
        'Employee' => array(
            'foreignKey' => 'employee_id',
            //'fields' => array('id', h('emp_ape_pat'), h('emp_ape_mat'), h('name')),
            'Horario' => array(
                'foreignKey' => 'employee_id',
                //'fields' => array('id' )
            ),
        ),
    ),
    'conditions' => $conditions,
    'order' => 'Asistencia.employee_id',
);
However, my Horario record is linked to Employees via another field: emp_appserial
Employee : emp_appserial = 373
Horario : employee_id = 373
How can my Employee array contain Horario array? (they do share the value just mentioned).
Currently, the Horario contain uses the value shared by Asistencia.employee_id and Employee.id (3701). I checked the sql_dump, and it is trying to get the Horario via
"WHERE `Horario`.`employee_id` = (3701)"
but for the Employee to contain Horario, it should use the value shared by Employee.emp_appserial and Horario.employee_id (373).
This is what I get (empty Horario array at the bottom):
array(
(int) 0 => array(
'Asistencia' => array(
'id' => '5',
'name' => null,
'employee_id' => '3701',
'as_date' => '2012-11-19',
),
'Employee' => array(
'id' => '3701',
'emp_appserial' => '373',
'emp_appstatus' => '8',
'AgentFullName' => '3701 PEREZ LOMELI JORGE LORENZO',
'FullNameNoId' => 'PEREZ LOMELI JORGE LORENZO',
'Horario' => array()
)))
Please notice:
'employee_id' => '3701', (Asistencia)
and
'emp_appserial' => '373', (Employee)
my Horario has 'employee_id' = 373.
How could I make the switch so the relation Employee<->Horario is based on emp_appserial?
Thank you in advance for any help.
Firstly you may be getting problems because you're using Spanish words for your Model names and I suspect you're also using them for Controllers.
This is a very bad practice since CakePHP's idea is:
Convention over configuration.
This is achieved through the Inflector class which "takes a string and can manipulate it to handle word variations such as pluralizations or camelizing and is normally accessed statically". But this works ONLY WITH English words. What this means for you is that you may be missing the data because Cake is not able to build the DB queries right, since you're using Spanish words. By doing so you're making the use of CakePHP's flexible persistence layer obsolete - you will have to make all the configurations by hand. Also most likely pagination will NOT WORK. Other parts of the framework may also not work properly:
HtmlHelper, FormHelper, Components, etc. ...
These problems will expand if you try to use more complex Model associations such as HABTM or "hasMany through the Join Model".
So I do not know why exactly you aren't seeing the Horario record, but what I just explained most likely is your problem.
What you're trying to do is against the core principles of CakePHP and you'd save yourself a lot of problems if you refactor a bit and use English words. Your other option is NOT TO use Cake.
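Independently of the naming question, one workaround that avoids contain for this link is to fetch the Horario records separately and attach them in PHP by matching Employee.emp_appserial against Horario.employee_id. This is only a sketch under that assumption, not something the answer above proposes: the function attachHorarios and the sample Horario rows are invented for illustration, and the input mirrors the result shape shown in the question.

```php
<?php
// Hypothetical workaround: attach Horario rows to each Employee by matching
// Employee.emp_appserial against Horario.employee_id in plain PHP.
// $rows has the shape of the find() result shown in the question;
// $horarios is a flat list of Horario records fetched separately.
function attachHorarios(array $rows, array $horarios)
{
    // Index the Horario records by the value they share with Employee.
    $bySerial = array();
    foreach ($horarios as $horario) {
        $bySerial[$horario['employee_id']][] = $horario;
    }
    foreach ($rows as $i => $row) {
        $serial = $row['Employee']['emp_appserial'];
        $rows[$i]['Employee']['Horario'] =
            isset($bySerial[$serial]) ? $bySerial[$serial] : array();
    }
    return $rows;
}

$rows = array(array(
    'Asistencia' => array('id' => '5', 'employee_id' => '3701'),
    'Employee'   => array('id' => '3701', 'emp_appserial' => '373', 'Horario' => array()),
));
$horarios = array(
    array('id' => '1', 'employee_id' => '373'), // made-up sample rows
    array('id' => '2', 'employee_id' => '999'),
);
$result = attachHorarios($rows, $horarios);
```

The cost is one extra query for the Horario records of the current page, after which the merged rows have the same structure a working contain would have produced.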
