"Random Article" Feature on wikipedia.com [closed] - random

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I would like to know what algorithm and what programming language Wikipedia uses to randomly choose an article to display.
I would also like to know how it works so fast.

Here's information on that.
Every article is assigned a random number between 0 and 1 when it is created (these are indexed in SQL, which is what makes selection fast). When you click "Random article", it generates a target random number and then returns the article whose recorded random number is closest to this target (in practice, the first one at or above it).
If you are interested you can read the actual code here.
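
To make the idea concrete, here is a minimal sketch in Python rather than the actual MediaWiki code. The class name RandomArticleIndex and its methods are made up for illustration, a sorted list stands in for the SQL index, and wrapping around to the first article when nothing lies at or above the target is an assumption of this sketch.

```python
import bisect
import random


class RandomArticleIndex:
    """Toy stand-in for an indexed column of pregenerated random values."""

    def __init__(self, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self._keys = []    # random values kept sorted, like a database index
        self._titles = []  # titles stored in the same order as _keys

    def add(self, title):
        # The random value is assigned once, when the article is created.
        key = self._rng.random()
        pos = bisect.bisect_left(self._keys, key)
        self._keys.insert(pos, key)
        self._titles.insert(pos, title)

    def pick(self):
        if not self._keys:
            raise LookupError("no articles")
        target = self._rng.random()
        # Binary search for the first stored value >= target.
        pos = bisect.bisect_left(self._keys, target)
        if pos == len(self._keys):
            pos = 0  # assumption: wrap around if nothing is above the target
        return self._titles[pos]


index = RandomArticleIndex()
for title in ["Horse", "Paris", "Markov chain", "Johann Sebastian Bach"]:
    index.add(title)
print(index.pick())
```

The lookup is a binary search over sorted values, which is why it stays fast no matter how many articles there are. One side effect of this scheme is that an article sitting just after a large gap in the random values gets picked more often than its neighbours.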

Something along these lines:
"SELECT cur_id,cur_title
FROM cur USE INDEX (cur_random)
WHERE cur_namespace=0 AND cur_is_redirect=0
AND cur_random>RAND()
ORDER BY cur_random
LIMIT 1"

From MediaWiki.org:
MediaWiki is a free software wiki
package written in PHP, originally
for use on Wikipedia. It is now used
by several other projects of the
non-profit Wikimedia Foundation and by
many other wikis, including this
website, the home of MediaWiki.
MediaWiki is open source, so you can download the code and inspect it, to see how they have implemented this feature.

If you look at the source, they use PHP/MySQL and sort and filter pages by pregenerated random values (the page_random column), which have an index on them.

Related

Google Image Search Path [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
SEO question: if the images on a server are allowed to be indexed, are named wisely with descriptive names, and aren't oversized, does the image path or folder make a difference to the ranking of that image in Google image search?
E.g., is pic A ranked higher than pic B (below)? If so, why?
A: /images/cat-on-a-chair.jpg
B: /images/repository/cat-on-a-chair.jpg
thanks
It'd be too difficult to run a controlled case study on a factor that, if it did help, would be too minuscule to notice.
The short answer: it's highly unlikely.
Think of the image itself and the page the image is found on as two completely separate entities (they are, indeed). When you do a Google Image search, you are finding pages that contain that image. So a highly-ranked page is likely going to be a good candidate for image results. You aren't actually being returned direct images.
Other things that influence ranking for images would include image-specific data like ALT tags, description, the image name, and so forth.
For reference, here are the paths of the top five results for "horses":
http://upload.wikimedia.org/wikipedia/commons/thumb/8/85/Points_of_a_horse.jpg/330px-Points_of_a_horse.jpg
http://upload.wikimedia.org/wikipedia/commons/thumb/9/98/Horse-and-pony.jpg/310px-Horse-and-pony.jpg
http://images4.fanpop.com/image/photos/23500000/horse-horses-23582505-1024-768.jpg
http://www.hedweb.com/animimag/horses-gallop.jpg
http://www.horsesmaine.com/images/2%20%20horses.jpg
Scientifically, that's such a small sample that it's not worth mentioning. But let's assume it is: the majority of the results don't have relevant keywords in a directory path. Instead, a very highly-ranked website gets the first few positions.
If you wanted to take this further you could write a script to get a bigger sample, but at this point I'm hoping you've arrived at the conclusion that no, it doesn't make a difference.

Propose please an open data for graphs [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
I need to prepare myself for an upcoming task that involves a lot of graphs.
I need some data (freely available) to practise on.
The bigger, the better.
Could you suggest some open data resources?
I'd appreciate it.
You can visit http://snap.stanford.edu/data/ . It contains many different kinds of network or graph data.
This is an answer to your "could you suggest some open data resource?" and not to the "which consist of a lot of graphs" part, so please keep that in mind.
Here (data.gov.au) you can find a huge number of datasets (864!) of different types and in different formats (txt, csv, xml). You will find finance, industry, geography, and other datasets.
Otherwise, if you want some specific (and meaningful) data, for example global population density, you can look at this (a bit outdated, but useful) source from readwriteweb.com.
And one more source: "Open Governmental Datasets". It is well worth a look.

Algorithms to compose music [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Suppose I want to write an application to compose music.
I'd like to feed in a set of musical scores by a composer, e.g. Bach's "Well-Tempered Clavier", and the program should produce new scores in a similar style.
Are algorithms or even libraries known for this task?
Wikipedia provides this page about algorithmic music composition.
You could make something basic using Markov chains. The principle is to first produce some unit of music (a single note, for example) and then, based on the last produced unit, randomly select the next unit.
First, pass through the input music. Each time you see a particular note or other unit of music, simply record in a table what came after it. When you have gone through the entire input material, you will have a frequency table of which units follow which (after 'A', 'B' appeared 29 times, 'C' appeared 12 times and 'A' appeared twice; after 'B' ... etc).
Now select an initial note. Select the next one randomly according to the frequencies recorded in the table. Repeat until satisfied.
This will probably not yield good results if applied to individual notes; try short phrases instead. Also, the quality will improve if you have access to a large corpus of source music.
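
The steps above can be sketched in Python roughly as follows. This is only an illustration: the function names build_table and generate and the tiny input melody are made up, and the "units" here are single note names, though they could just as well be short phrases.

```python
import random
from collections import Counter, defaultdict


def build_table(units):
    """Count, for each unit, which units follow it in the input."""
    table = defaultdict(Counter)
    for current, following in zip(units, units[1:]):
        table[current][following] += 1
    return table


def generate(table, start, length, rng=None):
    """Walk the table, choosing each next unit in proportion to its count."""
    rng = rng if rng is not None else random.Random()
    result = [start]
    while len(result) < length:
        followers = table.get(result[-1])
        if not followers:
            break  # dead end: this unit was never followed by anything
        choices = list(followers)
        weights = [followers[c] for c in choices]
        result.append(rng.choices(choices, weights=weights)[0])
    return result


# Made-up input for illustration; a real corpus would be far larger.
melody = ["C", "D", "E", "C", "D", "G", "E", "C"]
table = build_table(melody)
print(generate(table, "C", 8))
```

Switching from single notes to short phrases only changes what goes into the input list (for example, tuples of notes); the table and the random walk stay the same.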

Generating first names based on race/ethnicity [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 9 years ago.
I want to generate some fake people for a piece of software I'm writing. These people also have ethnicities, and I'd prefer not to have names that look totally out of place when compared to those ethnicities.
My first idea was to base it on data. There is a table of first and last names from the 1990 US census with attached frequencies, but that says nothing about ethnicity. There is also a table of last names from the 2000 US census which is broken down by ethnicity, but it says nothing about first names.
So I need some way of generating first names based on ethnicity. Any ideas?
Use behindthename.com. They have very extensive lists of names by usage, including lists of popular names.
The site http://www.babynamefacts.com/ contains lists of most popular baby names per country. This may be a good starting point. For example, this page shows the most popular baby names for Serbia in 2009: http://www.babynamefacts.com/popularnames/countries.php?country=SRB .

Google similar images algorithm [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
Does anyone have an idea what sort of algorithm Google might be using to find similar images?
No, but they could be using SIFT.
I'm not sure this has much to do with image processing. When I ask for "similar images" of the Eiffel tower, I get a bunch of photos of Paris Hilton, and street maps from Paris. Curiously, all of these images have the word "Paris" in the file name.
Currently the Google Image Search provides these filtering options:
Image size
Face detection
Continuous-tone ("Photo") vs. smooth shading ("Clipart") vs. bitonal ("Line drawing")
Color histogram
These options can be seen in its Image Search Result page.
I don't know about faces, but see at least:
http://www.incm.cnrs-mrs.fr/LaurentPerrinet/Publications/Perrinet08spie
Compare two images the python/linux way
I have heard that one should use this when comparing images
(that is: build a probability model for each image, calculate the probabilities, then compare them with this):
http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
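
As a rough illustration of that idea (not a claim about what Google does), the sketch below treats each image's intensity histogram as its probability model and compares two of them with the Kullback-Leibler divergence. The histograms are made up, and the function names normalize and kl_divergence and the eps smoothing value are assumptions of this sketch.

```python
import math


def normalize(counts, eps=1e-9):
    """Turn raw histogram counts into probabilities, smoothing zero bins."""
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i), measured in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical 4-bin intensity histograms for two images.
hist_a = [10, 20, 30, 40]
hist_b = [25, 25, 25, 25]
p, q = normalize(hist_a), normalize(hist_b)
print(kl_divergence(p, q), kl_divergence(q, p))
```

The divergence is zero for identical distributions and grows as they differ. It is not symmetric, so D_KL(P || Q) and D_KL(Q || P) generally give different values.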
Or then it might even be one of those PCFG things that MIT people tend to use with robotics stuff. One I read used some sort of PCFG model made of basic shapes (that you can rotate magically) and searched the best match with
http://en.wikipedia.org/wiki/Inside-outside_algorithm

Resources