Is there a way to randomize search results (record ids) with Sphinx?

I have a complex SphinxQL query which, at the end, orders results by a specific field, Preferred, so that all records with the indexed value Preferred=1 come before all records with Preferred=0. I also order by weight(), so basically I end up with:
Select * from idx_X where MATCH('various parameters') ORDER by Preferred DESC,Weight() Desc
The problem is that, although Preferred records come first, records with equal weight come back sorted by ID, which groups results from one field, Vendor, into blocks. For instance I get:
Beta Shipping
Beta Shipping
Beta Shipping
Acme Widgets
Acme Widgets
Acme Widgets
Acme Widgets
Acme Widgets
This doesn't serve my purposes well in this case (often one 'Vendor' will have 1000 results).
So I'm looking to essentially do:
ORDER BY Preferred DESC,weight() DESC,ID RANDOM
So that among Preferred vendors with the same weight (e.g. 100), I get vendors in random order rather than in blocks.
Update: I did find what appears to be a possible answer in another Stack Overflow question.
The issue is that it seems to require SPH_SORT_EXTENDED, while I am forced to use SPH_RANK_PROXIMITY (ranker=proximity), and I am unclear whether I can combine ranking and sorting.
Update 2: If I remove my existing two-level Order and just do Order by Rand() it indeed returns random IDs. However I cannot add Rand() after Order by Preferred DESC,Weight() DESC or I get the following error:
1064 - sphinxql: syntax error, unexpected '(', expecting $end near '()

Sadly yes, RAND() only works as a single sort order expression, but it DOES work as a select function:
Select *, RAND() AS r from idx_X where MATCH('various parameters')
ORDER by Preferred DESC,Weight() Desc, r DESC
Or, if you want a more consistent ordering that is still mixed, you can for example use the CRC32() function on a string attribute:
Select *, CRC32(title) AS r from idx_X where MATCH('various parameters')
ORDER by Preferred DESC,Weight() Desc, r DESC
You can also just limit results to a few per vendor (vendor will need to be an attribute):
Select * from idx_X where MATCH('various parameters')
GROUP 3 BY vendor_id ORDER by Preferred DESC,Weight() Desc
GROUP N BY is a little-known but very useful Sphinx feature.
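To see why a third sort key breaks up the vendor blocks, here is a minimal Python sketch of the same ordering outside Sphinx. The rows, the `order_hits` and `crc_key` helpers and the hashed string are all made up for illustration; `zlib.crc32` only stands in for the idea behind CRC32(), it is not claimed to give the same values as Sphinx.

```python
import random
import zlib

# Hypothetical search hits: (id, vendor, preferred, weight).
hits = [
    (1, "Beta Shipping", 1, 100),
    (2, "Beta Shipping", 1, 100),
    (3, "Acme Widgets", 1, 100),
    (4, "Acme Widgets", 1, 100),
    (5, "Acme Widgets", 0, 100),
    (6, "Beta Shipping", 0, 90),
]


def order_hits(rows, tie_breaker):
    """Sort by preferred DESC, weight DESC, then tie_breaker(row) DESC."""
    return sorted(rows, key=lambda r: (r[2], r[3], tie_breaker(r)), reverse=True)


def crc_key(row):
    """Repeatable pseudo-random key derived from a string (assumed: vendor plus id)."""
    return zlib.crc32(f"{row[1]}-{row[0]}".encode())


# Random tie-breaker: the order inside each (preferred, weight) group changes per call.
shuffled = order_hits(hits, lambda r: random.random())

# Hash tie-breaker: still mixed, but the same on every call.
stable = order_hits(hits, crc_key)
```

In both cases the first two keys still decide the overall order; the third key only rearranges rows that tie on Preferred and weight.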

Related

Oracle Sql group function is not allowed here

I need someone who can explain "group function is not allowed here" to me, because I don't understand it and would like to.
I have to get the product name and the unit price of the products whose price is above the average.
I initially tried this, but Oracle quickly told me it was wrong:
SELECT productname,unitprice
FROM products
WHERE unitprice>(AVG(unitprice));
I searched for information and found that I could get it this way:
SELECT productname,unitprice FROM products
WHERE unitprice > (SELECT AVG(unitprice) FROM products);
What I want to know is: why do you need two SELECTs?
What does "group function is not allowed here" mean?
I have encountered this error more than once and would like to understand what to do when it appears.
Thank you very much for your time.
The phrase "group function is not allowed here" refers to anything that is in some way an "aggregation" of data, e.g. SUM, MIN, MAX, etc. These functions must operate on a set of rows, and to operate on a set of rows you need a SELECT statement. (I'm leaving out UPDATE/DELETE here.)
If this were not the case, you would end up with ambiguities. For example, let's say we allowed this:
select *
from products
where region = 'USA'
and avg(price) > 10
Does this mean you want the average price across all products, or just the average price for those products in the USA? The syntax would be ambiguous.
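If you want to try this without an Oracle instance, here is a small sketch using Python's built-in sqlite3 module. It is SQLite, not Oracle, so the error text differs, but the rule is the same: an aggregate directly in WHERE is rejected, while a sub-query that computes the average over an explicit set of rows works. The sample products and prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (productname TEXT, unitprice REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Chai", 18.0), ("Tofu", 10.0), ("Caviar", 62.0)],  # made-up rows
)

# An aggregate directly in WHERE: SQLite refuses it, much like Oracle does.
try:
    conn.execute(
        "SELECT productname FROM products WHERE unitprice > AVG(unitprice)"
    )
    aggregate_in_where_failed = False
except sqlite3.OperationalError:
    aggregate_in_where_failed = True

# The sub-query states exactly which rows the average is taken over.
above_average = conn.execute(
    "SELECT productname, unitprice FROM products "
    "WHERE unitprice > (SELECT AVG(unitprice) FROM products) "
    "ORDER BY productname"
).fetchall()
```

With these three rows the average is 30, so only the one product priced above it is returned.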
Here's another option:
SELECT *
FROM (
SELECT productname,unitprice,AVG(unitprice) OVER (PARTITION BY 1) avg_price
FROM products)
WHERE unitprice > avg_price
The reason your original SQL doesn't work is that you didn't tell Oracle how to compute the average. What table should it find it in? What rows should it include? What, if any, grouping do you wish to apply? None of that is communicated with "WHERE unitprice>(AVG(unitprice))".
Now, as a human, I can make a pretty educated guess that you intend the averaging to happen over the same set of rows you select from the main query, with the same granularity (no grouping). We can accomplish that either by using a sub-query to make a second pass on the table, as your second SQL did, or the newer windowing capabilities of aggregate functions to internally make a second pass on your query block results, as I did in my answer. Using the OVER clause, you can tell Oracle exactly what rows to include (ROWS BETWEEN ...) and how to group it (PARTITION BY...).
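The windowed variant can be tried the same way. This sketch again uses Python's sqlite3 rather than Oracle and assumes the bundled SQLite is 3.25 or newer (needed for window functions). It uses an empty OVER () instead of PARTITION BY 1; both treat the whole result as a single window. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (productname TEXT, unitprice REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Chai", 18.0), ("Tofu", 10.0), ("Caviar", 62.0)],  # made-up rows
)

# The inner query attaches the overall average to every row;
# the outer query can then compare two ordinary columns.
windowed = conn.execute(
    "SELECT productname, unitprice FROM ("
    "  SELECT productname, unitprice, AVG(unitprice) OVER () AS avg_price"
    "  FROM products"
    ") WHERE unitprice > avg_price ORDER BY productname"
).fetchall()
```

The outer WHERE no longer contains an aggregate at all, which is why the "group function" complaint goes away.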

Google DOC - Multiple Filters combined

I'm a newbie when it comes to Google Doc Filters and I would appreciate some help.
I have a list of articles that I would like to filter by company (in this case SONY), and also by lowest price combined with lowest shipping costs.
Example: the first filter I created produces a list of SONY articles.
=(filter(A2:D12;A2:A12="SONY"))
Now I would like the filter to return a single row, the one where the price and the shipping costs are lowest. In this case, the product is:
SONY headphones with the price of 20 and shipping costs of 2,99
I'm basically trying to combine the filters:
=(filter(A2:D12;A2:A12="SONY"))
=SMALL((C2:C12);2)
=SMALL((D2:D12);2)
in one single, long filter
Thank you
Solution:
FILTER alone would not work in your case because the conditions have a priority: column A has to be filtered first, before C and D are compared.
You may use QUERY instead:
=QUERY(A2:D12,"select * where A='SONY' order by C+D limit 1")
This selects the entries with the specified value in column A (company), orders them by the sum of C and D (price + shipping) in ascending order, and then outputs the first row, which is the minimum. If your locale uses ; as the argument separator, as in the formulas in the question, use ; instead of , after the range.
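For anyone who finds the query string hard to read, the same "filter first, then take the smallest sum" logic looks like this as a small Python sketch. The rows are invented, and the column layout (A company, B product, C price, D shipping) is an assumption based on the question.

```python
# Hypothetical sheet rows: (company, product, price, shipping).
rows = [
    ("SONY", "TV", 500.0, 9.99),
    ("SONY", "headphones", 20.0, 2.99),
    ("ACME", "headphones", 15.0, 1.99),
    ("SONY", "camera", 300.0, 4.99),
]

# where A='SONY'  ->  keep only the SONY rows
sony_rows = [r for r in rows if r[0] == "SONY"]

# order by C+D limit 1  ->  the row with the smallest price + shipping
cheapest = min(sony_rows, key=lambda r: r[2] + r[3], default=None)
```

The cheaper ACME row is never considered, because the company filter is applied before the totals are compared.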

Oracle duplicate field but still correct

So I built a query for my leadership team that was correct, but I don't understand why Oracle gave me the correct answer.
I have 3 tables that I needed to get data out of in order to get the total billed amount.
Here is my query (please forgive me, this is my 2nd post and I'm not sure how to properly format my queries):
select b.total_amount_billed as billed from t1.billing_information b
where b.billing_no in
(select h.billing_no
from t1.res_history h where h.res_seq_no in
(Select r.reservation_seq_no
from t1.res r where r.customer_order_no in ('THO40000') ))
So in the deepest select, I take the sequence numbers where my customer order number is THO40000; this query returns 2 sequence numbers.
The second sub-query returns the billing numbers for my order from the history table where the sequence numbers match. In this case, for this order, both use the same billing number, 312000.
The final select returns my total billed amount where it matches the billing numbers found, in my case $110.
The query works, but what I don't understand is why the result is not duplicated. Why does it not return 110 for each time it found 312000, giving me 2 records of 110? The billing number is a PK in the billing_information table. I'm not sure why it worked without me using the DISTINCT keyword on the query for the billing number.
Anyway, thanks for the help. I'll do my best to explain if you have questions!
You are being saved because you used IN to get the billing_no values to use, rather than an INNER JOIN between the two tables using b.billing_no = h.billing_no. A join would have duplicated the records, but your IN query is essentially this:
select b.total_amount_billed as billed
from t1.billing_information b
where b.billing_no in (312000, 312000);
If there is a single row in billing_information having billing_no equal to 312000, it is in the list, so the WHERE condition is true and it is included in the results. The fact that it is in the list twice doesn't make the IN condition "more true".
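Here is a small sketch of the difference, using Python's sqlite3 in place of Oracle. The table and column names follow the question, but the schema is cut down and the two history rows are invented to mirror the "same billing number twice" situation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE billing_information (
        billing_no INTEGER PRIMARY KEY,
        total_amount_billed INTEGER
    );
    CREATE TABLE res_history (res_seq_no INTEGER, billing_no INTEGER);
    INSERT INTO billing_information VALUES (312000, 110);
    -- two reservation rows that point at the same bill
    INSERT INTO res_history VALUES (1, 312000), (2, 312000);
    """
)

# IN is a yes/no membership test per billing row: one row out.
in_rows = conn.execute(
    "SELECT b.total_amount_billed FROM billing_information b "
    "WHERE b.billing_no IN (SELECT h.billing_no FROM res_history h)"
).fetchall()

# A join pairs the billing row with every matching history row: two rows out.
join_rows = conn.execute(
    "SELECT b.total_amount_billed FROM billing_information b "
    "JOIN res_history h ON b.billing_no = h.billing_no"
).fetchall()
```

The IN version returns the amount once, while the join version returns it once per matching history row.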

Distinct function somehow not giving back all types

Feel free to check it on my latest project: http://arda-maps.org:2480 arda arda as login.
Now check select distinct(type) from Location
Here you get 8 records (River,Lake,Region,City,Island,House,Mountain,Hill). But actually there are way more...
To show you that the distinct somehow is not giving back all distinct we search for one specific vertex with another type:
select * from Location where name = "Citadel of Gondor"
So am I using distinct in the wrong way? Or what could be the reason for the incomplete result list?
Indeed, the order in which the parts of the query are applied to the result set is not very obvious. You typed:
select distinct(type) from Location
and OrientDB Studio applies a limit of 20 (unless you change that or include a different limit in your query). So the query that finally runs is
select distinct(type) from Location limit 20
Now, this could mean one of two things:
(1) Find at most 20 locations and give me their distinct types
(2) Find at most 20 distinct types of all locations
Obviously, what you expect is the 2nd and what happens is the 1st. The solution is to use an inner query so that the limit explicitly applies to the outer query:
select from (select distinct(type) from Location) limit 20
This now clearly says: find the distinct types of all locations and return at most 20 (which is the same as (2)).
Weird. But if you try:
select distinct(type) from Location limit -1
you'll have all 64 distinct entries.
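The two readings of "distinct plus limit" described above can be compared with a few lines of Python. This is only an illustration of the ordering issue, not of how OrientDB executes the query internally; the list of types and the limit of 10 are made up.

```python
# Hypothetical "type" column of the Location class, in storage order.
types = ["River"] * 5 + ["Lake"] * 5 + ["City"] * 5 + ["Citadel"]
LIMIT = 10  # stand-in for the default limit applied by the tool


def distinct(values):
    """Return unique values, keeping first-seen order."""
    return list(dict.fromkeys(values))


# Reading (1): cut off the rows first, then de-duplicate what is left.
limit_then_distinct = distinct(types[:LIMIT])

# Reading (2): de-duplicate all rows, then cut off the result.
distinct_then_limit = distinct(types)[:LIMIT]
```

Under the first reading, any type that only occurs after the cut-off never shows up, which matches the incomplete list in the question.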

How to make Tableau run query for combined multiple selections in quick filter, example attached

It is hard to describe my question in the subject line. Here is an example.
I want Tableau to run a query that shows only the Account IDs that have both of the 2 products I selected in the Product A quick filter. In this example only the second Account ID should qualify. Is this possible?
Thanks for your help in advance!
Hmm, good question. It is not possible in the way you want (at least I can't think of a way to do that), with quick filters.
I can solve your specific problem (filtering customers that have at least 2 specific products in their history), but expanding for variable n products can be really troublesome.
So, first, create 2 parameters, Product1 and Product2. Each is a string, and you can populate its list from the [Product A] field. You will use these 2 parameters to specify the 2 products you want.
Now create a calculated field, [Product flag]:
IF [Product A] = [Product1] OR [Product A] = [Product2]
THEN 1
END
Now drag [Account ID] to the filters shelf. Open the filter options and go to Condition. Select By field, [Product flag], Sum, = 2.
That will work if there are no duplicated [Product A] values under the same [Account ID]. If that can happen, you need a slightly more sophisticated approach. [Product flag] becomes:
IF [Product A] = [Product1]
THEN 1
ELSEIF [Product A] = [Product2]
THEN 2
END
And the condition should be Count (Distinct) = 2.
In both cases it will keep only the Account IDs that have both of the selected products under them. They can have other products as well.
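To make the difference between the two flag versions concrete, here is a rough Python sketch of what each condition computes per account. It is not Tableau code; the account IDs and rows are invented, and the two product names are borrowed from elsewhere in this thread.

```python
from collections import defaultdict

PRODUCT1 = "Business Office Consolidation"  # plays the role of the Product1 parameter
PRODUCT2 = "Cabled Barcode Scanner"         # plays the role of the Product2 parameter

# Hypothetical rows: (account id, product).
rows = [
    ("ACC-1", PRODUCT2),
    ("ACC-1", PRODUCT2),          # same product twice
    ("ACC-2", PRODUCT1),
    ("ACC-2", PRODUCT2),
    ("ACC-2", "Something Else"),
    ("ACC-3", PRODUCT1),
]


def flag(product):
    """Second version of [Product flag]: 1, 2, or None (no ELSE branch)."""
    if product == PRODUCT1:
        return 1
    if product == PRODUCT2:
        return 2
    return None


flag_sum = defaultdict(int)        # first version: the flag is 1 for either product
distinct_flags = defaultdict(set)  # second version: which flag values were seen
for account, product in rows:
    value = flag(product)
    if value is not None:
        flag_sum[account] += 1
        distinct_flags[account].add(value)

by_sum = sorted(a for a, total in flag_sum.items() if total == 2)
by_count_distinct = sorted(a for a, seen in distinct_flags.items() if len(seen) == 2)
```

The sum condition also lets through the account that bought the same product twice, while the distinct count keeps only the account that has both products.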
EDIT: For the N product problem, I believe you're going to need a solution outside Tableau. One possibility is to use the JS API, so you can select the products you need in a JS interface and pass a parameter to Tableau.
In JS you could have a list from which you select as many items as you want, and a script that passes a parameter to Tableau based on the selection. It could be something like: product1,product2,product3...
Then you could use CONTAINS() to see if a product is in that list (and raise a flag), and count the ',' characters to see how many products were selected.
Unfortunately I have very limited knowledge of the JS API, but I strongly encourage you to take a look.
Really interesting question. It's surprisingly trickier to list the accounts that reference every product in a list than it is to list the accounts that reference any product in a list.
If you are willing to start with a less convenient user interface (suitable for ad-hoc analysis but not published dashboards) then try the following:
Create a filter based on Account Id, select Use all on the General tab, and By formula on the Condition tab. Enter the formula
Count(if [Product A] = "Business Office Consolidation" then 1 end) > 0 and Count(if [Product A] = "Cabled Barcode Scanner" then 1 end) > 0
This will filter to include only Account IDs that reference both products. You can extend this to a list of any number of required products. For relational data sources, it is implemented using a HAVING clause.
Of course, it can be tedious to revise this formula by hand, but it is one way to accomplish your analysis goal, and it can be instructive to understand how filter conditions work. Similar formulas are useful for many conditions.
You can create one or more dynamic sets using the same approach and then use them in calculated fields or on any shelf in Tableau, and combine them to create new sets. You can also move the formula to a calculated field for convenience.
Note, the 1 in the formula is not significant, any non-null value would work. Since there is no else clause, the formula evaluates to null for rows that fail the if test. And the Count() function just counts the number of rows that have non-null values for the expression.
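As a rough idea of what such a HAVING clause could look like, here is a sketch using Python's sqlite3. The SQL that Tableau actually generates is not shown here and will differ; the `orders` table, its columns and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (account_id TEXT, product TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("ACC-1", "Cabled Barcode Scanner"),
        ("ACC-1", "Cabled Barcode Scanner"),
        ("ACC-2", "Business Office Consolidation"),
        ("ACC-2", "Cabled Barcode Scanner"),
        ("ACC-3", "Business Office Consolidation"),
    ],
)

# CASE without ELSE yields NULL for non-matching rows, and COUNT skips NULLs,
# so each COUNT is the number of rows for that one product.
accounts_with_both = conn.execute(
    """
    SELECT account_id
    FROM orders
    GROUP BY account_id
    HAVING COUNT(CASE WHEN product = 'Business Office Consolidation' THEN 1 END) > 0
       AND COUNT(CASE WHEN product = 'Cabled Barcode Scanner' THEN 1 END) > 0
    ORDER BY account_id
    """
).fetchall()
```

Adding another required product is just one more COUNT(...) > 0 term, which is the same extension described for the Tableau formula.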
Coming up with an approach that lets you easily select products from a list without editing a formula will probably take some combination of more advanced features. I don't have an answer for you right now, but the features that are worth learning about, and that may or may not be part of the solution, include filter actions, context filters, top filters, count distinct, custom SQL, computed sets, table calculations, LOD expressions and the Javascript API. This would also be a good question to pose, with an example workbook, on the Tableau online forums at http://www.tableau.com under the Support menu.
