(SharePoint 2010, Visual Studio, C#)
I have a large SharePoint Document Library called LargeLib (and am concerned about performance).
I have about 100 IDs and need to extract the respective items (with just three columns: ID, Name, Author).
The CAML query gets very large, as there is no "IS IN" clause in CAML, so I would have to repeat the same CAML lines a hundred times. Will this be a good option? I wish I could pass it an array of IDs.
Do we have any other performance friendly option?
Thanks a lot in advance as I am stuck on this one.
This SO question seems to be the same as what you're asking. He solved it by building a function for nesting OR nodes...
One of the answers discusses using the <In> element for SharePoint 2010, so I guess you're in luck:
http://msdn.microsoft.com/en-us/library/ff625761.aspx
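For reference, a minimal C# sketch of building a single query with <In>, assuming the server object model and a document library where "Name" maps to the internal field FileLeafRef; adapt the field names to your list:

using System.Text;
using Microsoft.SharePoint;

// ids: your ~100 item IDs. SharePoint 2010 CAML supports <In> with up to 500 values.
static SPListItemCollection GetItemsByIds(SPList largeLib, int[] ids)
{
    var values = new StringBuilder();
    foreach (int id in ids)
        values.AppendFormat("<Value Type='Counter'>{0}</Value>", id);

    var query = new SPQuery
    {
        Query = "<Where><In><FieldRef Name='ID' /><Values>" + values + "</Values></In></Where>",
        // Only pull back the three columns: ID, Name (FileLeafRef) and Author (Created By).
        ViewFields = "<FieldRef Name='ID' /><FieldRef Name='FileLeafRef' /><FieldRef Name='Author' />",
        ViewFieldsOnly = true
    };
    return largeLib.GetItems(query);
}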
HTH
For a certain program I have some type/keyword values like this:
Program   Type     Keyword
PIM       Kind     Additional
PIM       Period   Education
PIM       Phase    Specialized
PIM       Skills   Professional
The Type is a fixed set of values, but the Keyword depends on the Program and Type. I want to transpose this result in Analytics into four columns, one per Type. The result has to look like this:
Program   Kind         Period      Phase         Skills
PIM       Additional   Education   Specialized   Professional
I have tried editing the column formula and putting in this formula:
CASE WHEN "Type"='Partial period' THEN "Keyword" END
and so on for each different Type, but it doesn't give me the result I want: all the new columns are empty.
I also tried with a pivot table, but the keyword isn't a measure, so I don't think this will work.
Can someone help?
This simply doesn't make sense in an analytical way. You have no fact, nothing you can measure, so there's no chance of using FILTER...USING..., for example.
Don't forget you're not in Excel or a drawing tool. You're in an analytics tool which tries to make sense out of data and not "show data in a weird way".
You have to model things nicely either in the data source itself or be clever in the construction of your RPD.
It's doable in the RPD but it will be quite static and if the list of values changes you will have to adapt it.
tl;dr - garbage data, garbage result
I have a database of about 200k books. I wish to give my users a way to quickly search for a book by title. Now, some titles might have a prefix like "A", "The", etc., and can also have numbers in the title, so a search for 12 should match books with "12", "twelve" and "dozen" in the title. This will work via AJAX, so I need to make sure the database query is really fast.
I assume that most users will try to search using some words of the title, so I'm thinking of splitting all the titles into words and creating a separate database table that maps words to titles. However, I fear this might not give the best results. For example, the book title could be 2 or 3 commonly used words, and I might get a list of books with longer titles that contain all 2-3 words, with the one I'm looking for lost like a needle in a haystack. Also, searching for a book with many words in the title might slow down the query because of a lot of OR clauses.
Basically, I'm looking for a way to:
find the results quickly
sort them by relevance.
I assume this is not the first time someone needs something like this, and I'd hate to reinvent the wheel.
P.S. I'm currently using MySQL, but I could switch to anything else if needed.
Using SOUNDEX is the best way, I think.
SELECT
    b.id,
    b.title
FROM books AS b
WHERE b.title SOUNDS LIKE 'Shaw'
-- This will match 'Saw', etc.
For the best database performance, you can precalculate the SOUNDEX value of your titles and store it in a new column. You can calculate the soundex of any string with SOUNDEX('Hello').
Example usage:
UPDATE `books` SET `soundex_title` = SOUNDEX(title);
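If the backend serving your AJAX calls happens to be .NET (just an assumption, as are the connection string and column names), the lookup against that precomputed column could look something like this with MySQL Connector/NET:

using System.Collections.Generic;
using MySql.Data.MySqlClient;

static List<string> SearchBySoundex(string term)
{
    var results = new List<string>();
    using (var connection = new MySqlConnection("server=localhost;database=library;uid=app;pwd=secret"))
    using (var command = new MySqlCommand(
        // Compare against the precomputed soundex_title column instead of recomputing per row.
        "SELECT title FROM books WHERE soundex_title = SOUNDEX(@term)", connection))
    {
        command.Parameters.AddWithValue("@term", term);
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
                results.Add(reader.GetString(0));
        }
    }
    return results;
}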
You might want to have a look at Apache Lucene. It is a high-performance, Java-based information retrieval library.
You would create an IndexWriter and index all your titles, and you can add fields (have a look at the class) linking back to the actual book.
When searching, you need an IndexReader and an IndexSearcher, and use the search() operation on them.
Have a look at the sample in src/demo and at: http://lucene.apache.org/java/2_4_0/demo2.html
Using information retrieval techniques makes indexing take longer, but each search does not have to scan most of the titles, and overall you can expect better search performance.
Also, choosing a good Analyzer lets you ignore stop words such as "the" and "a".
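If you are on .NET, the Lucene.Net port exposes the same classes. A rough sketch of the idea (assuming Lucene.Net 3.0.3; the field names and sample data are placeholders):

using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

var dir = FSDirectory.Open(new DirectoryInfo("book-index"));
// StandardAnalyzer drops stop words such as "the" and "a".
var analyzer = new StandardAnalyzer(Version.LUCENE_30);

// Index time: one Document per book, storing the database id alongside the analyzed title.
using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
{
    var doc = new Document();
    doc.Add(new Field("id", "356", Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.Add(new Field("title", "A dime in a dozen", Field.Store.YES, Field.Index.ANALYZED));
    writer.AddDocument(doc);
}

// Search time: results come back ranked by relevance.
using (var searcher = new IndexSearcher(dir, true))
{
    var parser = new QueryParser(Version.LUCENE_30, "title", analyzer);
    TopDocs hits = searcher.Search(parser.Parse("dozen"), 10);
    foreach (var scoreDoc in hits.ScoreDocs)
    {
        Document found = searcher.Doc(scoreDoc.Doc);
        // found.Get("id") and found.Get("title") identify the matching book.
    }
}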
One solution that would easily accommodate your volume of data and speed requirement is to use the Redis key-value store.
The way I see it, you can go ahead with your solution of mapping titles to keywords and storing them under the form:
keyword : set of book titles
Redis already has a built-in set data-type that you can use.
Next, to get the titles of the books that contain the search keywords, you can use the SINTER command, which performs the set intersection for you.
Everything is done in memory; therefore the response time is very fast.
Also, if you want to save your index, Redis has a number of different persistence/caching mechanisms.
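A minimal sketch of that keyword-to-set-of-titles layout from .NET, assuming the StackExchange.Redis client; the "kw:" key prefix and the synonym expansion are my own conventions, not anything Redis requires:

using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost");
IDatabase db = redis.GetDatabase();

// Index time: add each title to one set per keyword (including synonyms like 12/twelve/dozen).
db.SetAdd("kw:dime", "A dime in a dozen");
db.SetAdd("kw:dozen", "A dime in a dozen");
db.SetAdd("kw:twelve", "A dime in a dozen");
db.SetAdd("kw:12", "A dime in a dozen");

// Query time: intersect the sets for every keyword the user typed (SINTER under the hood).
RedisValue[] titles = db.SetCombine(SetOperation.Intersect,
    new RedisKey[] { "kw:dime", "kw:dozen" });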
Apache Lucene with Solr is definitely a very good option for your problem.
You can link Solr/Lucene directly to your MySQL database and have it index the data. Here is a simple tutorial on how to link your MySQL database with Lucene/Solr: http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/
Here are the advantages and pains of using Lucene-Solr instead of MySQL full text search: http://jayant7k.blogspot.com/2006/05/mysql-fulltext-search-versus-lucene.html
Keep it simple. Create an index on the title field and use wildcard pattern matching. You cannot possibly make it any faster, as your bottleneck is not the string matching but the number of strings you want to match against the title.
And I just came up with a different idea. You say that some words can be interpreted differently, like 12, twelve and dozen. Instead of creating a query with the different interpretations, why not store the different interpretations of the titles in a separate table with a one-to-many relation to the books table? You can then GROUP BY book_id to get unique book titles.
Say the book "A dime in a dozen". In books table it will be:
book_id=356
book_title='A dime in a dozen'
In titles table will be stored:
titles_id=123
titles_book_id=356
titles_title='A dime in a dozen'
--
titles_id=124
titles_book_id=356
titles_title='A dime in a 12'
--
titles_id=125
titles_book_id=356
titles_title='A dime in a twelve'
The query for this:
SELECT b.book_id, b.book_title
FROM books b JOIN titles t on b.book_id=t.titles_book_id
WHERE t.titles_title LIKE '%twelve%'
GROUP BY b.book_id
Now, insertion becomes a much bigger task, but the variants can be generated outside the database and inserted in one swoop.
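A minimal sketch of generating those variants outside the database (the synonym groups are purely illustrative assumptions; you would maintain your own list):

using System;
using System.Collections.Generic;
using System.Linq;

static class TitleVariants
{
    // Hypothetical synonym groups; extend as needed.
    private static readonly string[][] Synonyms =
    {
        new[] { "12", "twelve", "dozen" },
        new[] { "2", "two" },
        new[] { "3", "three" },
    };

    // Returns the original title plus one variant per synonym substitution.
    public static IEnumerable<string> Expand(string title)
    {
        yield return title;
        var words = title.Split(' ');
        for (int i = 0; i < words.Length; i++)
        {
            var group = Synonyms.FirstOrDefault(g =>
                g.Contains(words[i], StringComparer.OrdinalIgnoreCase));
            if (group == null) continue;
            foreach (var synonym in group.Where(s =>
                !s.Equals(words[i], StringComparison.OrdinalIgnoreCase)))
            {
                var variant = (string[])words.Clone();
                variant[i] = synonym;
                yield return string.Join(" ", variant);
            }
        }
    }
}

// Expand("A dime in a dozen") yields:
//   "A dime in a dozen", "A dime in a 12", "A dime in a twelve"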
Has anyone found a good way to either merge or remove duplicates that are in custom entities? In our case we have two custom entities, literature history and subscriptions which relate contacts back to a custom entity named literature.
I can run a duplicate detection job, but this returns thousands of records and deleting them one at a time is impractical at best. We would like to either be able to merge them or just delete the duplicates. However, much Google searching has not turned up any good suggestions other than "you can write something."
Okay, but where do I even get started? Should I be bulk deleting from the duplicate detection job? Should I try just writing a quick and dirty C# program with the SDK? Is there a way to merge custom entities that just requires some magical workflow voodoo?
EDIT: FYI, what I eventually did was set the deletion state code, using some fun SQL to quickly find the duplicates:
UPDATE T1
SET DeletionStateCode = 2
FROM New_subscriptionhistory T1
INNER JOIN New_subscriptionhistory T2
    ON T1.New_LiteratureId = T2.New_LiteratureId
    AND T1.New_ContactId = T2.New_ContactId
    AND T1.CreatedOn > T2.CreatedOn
    AND T1.statecode = 0
    AND T2.statecode = 0
You should look into creating a Bulk Delete Job using the SDK.
Here's a short tutorial.
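If it helps as a starting point, here is a rough sketch of scheduling such a job from code, written against the CRM 2011 SDK's BulkDeleteRequest (the entity/attribute names and query criteria are placeholders; the 4.0 SDK has an equivalent message with slightly different types):

using System;
using Microsoft.Crm.Sdk.Messages;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

static void ScheduleDuplicateCleanup(IOrganizationService service)
{
    // Placeholder query: point this at whatever identifies your duplicate records.
    var duplicates = new QueryExpression("new_subscriptionhistory")
    {
        ColumnSet = new ColumnSet("new_subscriptionhistoryid")
    };
    duplicates.Criteria = new FilterExpression();
    duplicates.Criteria.AddCondition("statecode", ConditionOperator.Equal, 0);

    var request = new BulkDeleteRequest
    {
        JobName = "Remove duplicate subscription history",
        QuerySet = new[] { duplicates },
        StartDateTime = DateTime.Now,
        RecurrencePattern = string.Empty,     // run once, no recurrence
        SendEmailNotification = false,
        ToRecipients = new Guid[0],
        CCRecipients = new Guid[0]
    };

    service.Execute(request);
}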
I won't say with certainty that this is the only or the best way, but we've used SQL queries in the _MSCRM database, setting the DeletionStateCode of any duplicated entity to 2.
Currently I cannot quickly filter the results of a work-item query: running a query gives us a result table, and within this table there is no means to filter the rows to show just the ones containing certain text.
Do you know how to filter that, or can you suggest any add-ons/tools for it?
Thank you.
Nam.
You could use Telerik's free Work Item Manager. It lets you search the text of work items and offers other useful filtering and grouping tools. Very useful.
We use Excel integration pretty heavily in our shop. The familiar sort and filter controls work well.
Not sure if this is what you are looking for:
You can "Edit Query" from the toolbox and save it as a new query.
After a while (a long one :)), I luckily noticed that there's a filter box when browsing work items in the Pending Changes pane (in VS 2010: View - Other Windows - Pending Changes). It's so nice that now I can quickly get the exact items I'm looking for.
Today, I found an addon that may fit my need: Search Work Items for TFS 2010
It is a search box, not exactly the filter for the current query.
(similar problem with this question)
I re-post the answer here:
After a long time, today I found the solution for this: use the MS VS 2010 Team Web Access to open your queries using a web browser!
Advantages of doing this:
No delay when clicking an item
Utilize all the browser features like searching, bookmarks, ... to work with the queries
Enjoy!
How do I get the average of a column in terms of another column and put this value in a third column, with the value evaluated for each retrieved record?
In other words: "How do I perform an operation on two columns in a report and insert the result in a third column?"
FYI: I'm using the Crystal Reports Designer embedded in Visual Studio .NET 2005.
First you need to create a formula (with content like "{table.commission}/{table.fare}*100"). I hope the following MSDN links will help you (I personally have never used the embedded variant of the Crystal Reports Designer):
Crystal Reports Basic for Visual Studio
Formula Overview