I'm working on a program to "optimally" buy magic cards. On the site each user has a "mini-shop", think eBay without the auctions.
The user enters a list of cards they want to buy; I then fetch all offers from the site and print an "optimal" shopping list, optimal meaning cheapest. Prices differ between shops, and postage also changes depending on how many cards you buy.
I would like to implement some algorithm which creates that list for me. I have written one, which works (I think), but I have no idea how well it works.
So my question is this: can this problem be solved by some existing algorithm? It would need to deal with ~1000 offers for EACH card (normally 40-60 cards, so around 50k different offers).
Can someone point me in the right direction on this?
The "partition" or "bin packing" problems (which are both mappable to what you want to do) is known to be NP-complete. Thus, the only way to make SURE that you have the optimal solution is to try all possible solutions and pick the best way.
If the user wants to buy 1,000 cards, trying all possible options is not computationally feasible, so you need to use heuristics.
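For example, a quick greedy heuristic might assign each card to whichever shop currently looks cheapest, counting a shop's postage only the first time it is used. A rough sketch (the offers/postage structures are invented for illustration, and postage that scales with card count is ignored):

```python
# Greedy sketch: assign each card to the shop with the lowest marginal
# cost, where a shop's postage is only paid the first time we use it.
# `offers` maps card -> {shop: price}; `postage` maps shop -> flat fee.
# (Both structures are assumptions, not from the question.)

def greedy_shopping_list(offers, postage):
    used_shops = set()
    assignment = {}
    # Handle the rarest cards first so they don't get stranded.
    for card in sorted(offers, key=lambda c: len(offers[c])):
        def marginal_cost(shop):
            extra_postage = 0 if shop in used_shops else postage[shop]
            return offers[card][shop] + extra_postage
        best_shop = min(offers[card], key=marginal_cost)
        assignment[card] = best_shop
        used_shops.add(best_shop)
    return assignment

offers = {"Shivan Dragon": {"A": 2.0, "B": 1.5},
          "Counterspell":  {"A": 0.5}}
postage = {"A": 3.0, "B": 4.0}
print(greedy_shopping_list(offers, postage))
```

This won't be optimal in general, but it gives a baseline to compare smarter approaches against.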
I work in a B2B e-commerce company and we want to improve our user experience with a function called "Magic shopping cart".
Let me explain:
Our website is a marketplace with multiple sellers selling a range of products with limited stock per product. The point of our function is to help our customers find the cheapest cart for all the products they wish to buy.
At the moment customers need to search through the whole website to find the best prices and to group as many products as possible with the same seller to reduce shipping fees.
We are looking for an algorithm that does all the research for our customers, i.e. finding the best combination of sellers and products in order to buy the products as cheaply as possible.
We have written a function that builds every possible shopping cart for the given products and quantities and then tests which one is the cheapest. This works flawlessly, except that it takes way too much time.
We need a quicker/more efficient way to find the cheapest cart. We have thought of machine learning (we are no experts), but we are open to all ideas.
Conventional algorithms offer better speed in most cases compared to machine learning algorithms. If the customer wants particular goods, and there is already a list of ALL offerings of these goods, then you just need an efficient search algorithm.
Machine learning would help you to identify which goods match which classes, for example, but that is apparently not the problem you are trying to solve.
Perhaps you are looking for a trade-off between the speed and quality of the magic-cart feature (not the optimum, but a good solution). In that case there just might be room for some machine learning, but it would take a more specific formulation of the search task to come up with a specific algorithm!
You might also look into evolutionary algorithms and other optimization methods.
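To make the local-search direction concrete, here is a rough hill-climbing sketch (all data structures invented; a genetic algorithm would evolve a whole population of such carts with crossover and mutation instead of improving a single one):

```python
import random

# Hill-climbing sketch (data layout invented): a cart maps
# product -> seller; total cost = item prices plus one shipping fee
# per distinct seller used. Keep any single-product move that makes
# the cart cheaper.

def total_cost(cart, prices, shipping):
    items = sum(prices[p][s] for p, s in cart.items())
    fees = sum(shipping[s] for s in set(cart.values()))
    return items + fees

def improve(cart, prices, shipping, iterations=10000):
    cost = total_cost(cart, prices, shipping)
    products = list(cart)
    for _ in range(iterations):
        p = random.choice(products)
        candidate = dict(cart)
        candidate[p] = random.choice(list(prices[p]))
        c = total_cost(candidate, prices, shipping)
        if c < cost:
            cart, cost = candidate, c
    return cart, cost
```

This trades optimality for speed, which sounds like exactly the trade-off you're after.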
Hospitals are changing the way they sterilize their equipment. Previously the local surgeons kept all their own equipment and made their own surgery trays. Now they have to conform to a country-wide standard. They want to know how many of the new trays they can make from their existing stock, and how much new equipment they need to buy.
The inventory of medical equipment looks like this:
http://pastebin.com/rstWSurU
Each hospital has codes for various pieces of medical equipment and a number for how many they have of the corresponding item.
3 surgery trays with their corresponding items are shown in this dictionary:
http://pastebin.com/bUAZhanK
There are a total of 144 different operation trays.
The hospitals will be told they need 25 of tray x, 30 of tray y, etc...
They would like to maximize the number of trays they can complete with their current stock. They would also like to know what equipment they need to purchase in order to complete the remaining trays.
I have thought about two possible solutions: one is representing the problem as a linear programming problem; the other is solving the first 90% of the problem with a round-robin brute-force approach and the remaining 10% with a randomized algorithm run several times, then picking the best of those tries.
I would love to hear if anyone knows a smart way of how to tackle this problem!
If I understand this correctly we can optimize for each hospital separately. My guess is that the following would be a good start for an MIP (Mixed Integer Programming) model:
I use the following indices: i for items and t for trays. x(t,i) indicates how many items of type i we assign to trays of type t. y(t) counts the number of trays of each type that we can compose using the available items. From the solution we can calculate the shortages that we need to order.
Of course we are just maximizing the number of trays we can make. There is no consideration of balancing (many trays of one type and few or zero of another). I mitigate this a little by not allowing more trays to be created than required (if we have more items, they need to go to other types of trays). This requirement is formulated as an upper bound on y(t).
For large problems we can restrict the (t,i) combinations to the ones that are possible; this will make the model smaller. In precise math notation the model looks something like the following (the symbols q(t,i) for how many items of type i one tray of type t needs, a(i) for the available stock, and d(t) for the demanded trays are my own shorthand):
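$$
\begin{aligned}
\max\quad & \sum_t y_t \\
\text{s.t.}\quad & x_{t,i} = q_{t,i}\, y_t && \forall (t,i) \text{ with } q_{t,i} > 0 \\
& \sum_t x_{t,i} \le a_i && \forall i \\
& 0 \le y_t \le d_t, \quad y_t \in \mathbb{Z}
\end{aligned}
$$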
A further optimization would be to substitute out the variables x(t,i).
Adding the shipping of surplus items to other hospitals would make the model more difficult. In that case we could end up with a model that needs to look at all hospitals simultaneously. That may be an interesting case for some decomposition approach.
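As a concrete starting point, here is a minimal sketch of the model in Python with PuLP (all data invented; x(t,i) is already substituted out, as suggested above):

```python
import pulp

# Minimal PuLP sketch of the tray MIP. All data below is invented:
# need[t][i] = items of type i required by one tray of type t,
# stock[i]   = items currently available, demand[t] = trays required.
need = {"trayX": {"scalpel": 2, "clamp": 1}, "trayY": {"clamp": 3}}
stock = {"scalpel": 10, "clamp": 9}
demand = {"trayX": 4, "trayY": 3}

prob = pulp.LpProblem("trays", pulp.LpMaximize)
y = {t: pulp.LpVariable(f"y_{t}", lowBound=0, upBound=demand[t],
                        cat="Integer") for t in need}

prob += pulp.lpSum(y.values())  # maximize the number of trays composed

# Items used across all trays cannot exceed the current stock
# (x(t,i) = need[t][i] * y(t) has been substituted out).
for i in stock:
    prob += pulp.lpSum(need[t].get(i, 0) * y[t] for t in need) <= stock[i]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for t, var in y.items():
    print(t, int(var.value()))
# Shortages to order per item i: max(0, sum_t need[t][i]*demand[t] - stock[i]).
```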
I want to find the best matching algorithm to recreate an economic simulation.
I will create different groups of customers. Each group will have particular parameters that determine what those customers want to buy. Examples of those parameters: quality, features, marketing, etc.
Each player in my game will create different products and try to fill the needs of the different groups of customers. Then they will put a price on each product and decide how much to produce (limited quantity).
So, on one side, you have a limited quantity of customers. On the other side, you have a limited quantity of products. These two quantities do not need to be equal (but they can be). So you might have too many products for the number of customers, or too many customers for the number of products. But one thing is sure: every customer wants to buy a product, unless there is a shortage.
I found the stable marriage algorithm, but it doesn't seem to fit my situation exactly. What would be the best matching algorithm for this?
This question is related to a previous post about a similar subject:
An algorithm for economic simulation?
One way to think about this problem is as a maximum-weight bipartite matching problem. In your setup, you can think of the problem as a graph with two groups of nodes:
Nodes corresponding to customers
Nodes corresponding to products
There is an edge pairing each customer with the products they're interested in buying, with the weight of an edge being how much the customer wants that particular product. Since customers aren't paired with customers and products aren't paired with products, this graph is bipartite.
Given this setup, one option would be to find a matching in this graph with the maximum possible total benefit (that is, maximizing the total amount of utility gained by people buying the appropriate products). This way, everyone who can buy something will end up doing so, unless other people so disproportionately want a customer's preferred products that it makes more sense for that customer not to get any of them. There are many algorithms for maximum-weight bipartite matching, and they run fairly quickly.
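A minimal sketch using SciPy's assignment solver (the utility numbers are made up; a zero means "not interested", and products with stock greater than one would get duplicated columns):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Max-weight bipartite matching via the Hungarian algorithm.
# Rows = customers, columns = products; entries = how much that
# customer wants that product. All numbers are invented.
utility = np.array([
    [5, 1, 0],   # customer 0
    [2, 4, 1],   # customer 1
    [0, 3, 3],   # customer 2
])

rows, cols = linear_sum_assignment(utility, maximize=True)
for c, p in zip(rows, cols):
    if utility[c, p] > 0:            # skip "no interest" pairings
        print(f"customer {c} buys product {p} (utility {utility[c, p]})")
```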
Hope this helps!
Or The Traveling Salesman plays Magic!
I think this is a rather interesting algorithmic challenge. Curious if anyone has any good suggestions for solving it, or if it is already solvable in a known way.
TCGPlayer.com sells collectible cards for a variety of games, including Magic: The Gathering. Instead of just selling cards from their own inventory, they are actually a re-seller for multiple vendors (50+). Each vendor has a different inventory of cards and a different price per card. Each vendor also charges a flat rate for shipping (usually). Given all of that, how would one find the best price for a deck of cards (say 40 - 100 cards)?
Just finding the best price for each card doesn't work because if you order 10 cards from 10 different vendors then you pay shipping 10 times, but if you order all 10 from one vendor you only pay shipping once.
The other night I wrote a simple HTML Scraper (using HTML Agility Pack) that grabs all the different prices for each card, and then finds all the vendors that carry all the cards in the deck, totals the price of the cards from each vendor and sorts by price. That was really easy. The total prices ended up being near the total median price for all the cards.
I did notice that some of the individual cards ended up being much higher than the median price. That raises the question of splitting an order over multiple vendors, but only if enough savings could be made by splitting the order up to cover the additional shipping (each added vendor adds another shipping charge).
Logically it seems that the best price will probably only involve a few different vendors, but if the cards are expensive enough (and some are) then in theory ordering each card from a different vendor could still result in enough savings to justify all the extra shipping.
If you were going to tackle this how would you do it? Pure brute force figuring every possible combination of card / vendor combinations? A process that is more likely to be done in my lifetime would seem to involve a methodical series of estimates over a fixed number of iterations. I have a couple ideas, but am curious what others might suggest.
I am looking more for the algorithm than actual code. I am currently using .NET though, if that makes any difference.
I would just be greedy.
Assume that you are going to eat the shipping cost and buy from all vendors. Work out the absolute lowest price you can get. Then for each vendor work out how much being able to buy some cards from them versus someone else saves you. Order the vendors by (shipping cost - incremental savings).
Starting with the vendor who provides the least value, axe that vendor, redistribute their cards to the other vendors, and recalculate the incremental savings. Wash, rinse, and repeat until your most marginal vendor is saving you money.
This should find a good solution but is not guaranteed to find the best solution. Finding the absolute best solution, though, seems likely to be NP-hard.
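A sketch of one way to implement that elimination loop (prices[card][vendor] and a flat per-vendor shipping fee are assumed; the savings ordering is approximated by always axing the vendor whose removal helps the most):

```python
# Greedy vendor-elimination sketch, following the answer above.
# prices[card][vendor] = card price; shipping[vendor] = flat fee.
def order_cost(cards, vendors, prices, shipping):
    # Cheapest allowed vendor for each card, plus shipping for
    # every vendor actually used.
    picks = {c: min((v for v in vendors if v in prices[c]),
                    key=lambda v: prices[c][v]) for c in cards}
    used = set(picks.values())
    return (sum(prices[c][picks[c]] for c in cards)
            + sum(shipping[v] for v in used))

def greedy_eliminate(cards, prices, shipping):
    vendors = set(shipping)
    cost = order_cost(cards, vendors, prices, shipping)
    while len(vendors) > 1:
        best = None
        for v in vendors:
            rest = vendors - {v}
            # Only consider axing v if every card is still available.
            if all(any(w in prices[c] for w in rest) for c in cards):
                new_cost = order_cost(cards, rest, prices, shipping)
                if new_cost <= cost and (best is None or new_cost < best[0]):
                    best = (new_cost, v)
        if best is None:
            break           # removing any vendor would raise the total
        cost, axed = best
        vendors.discard(axed)
    return vendors, cost
```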
This is isomorphic to the uncapacitated facility location problem.
card in the deck : client
vendor : possible facility location
vendor shipping rate : cost of opening a facility at a location
cost of a card with a particular vendor : "distance" from a client to a facility
Facility location is a well-studied problem in the combinatorial optimization literature.
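For reference, the standard MIP formulation of uncapacitated facility location, written in these terms ($f_v$ is vendor $v$'s shipping rate, $c_{iv}$ the price of card $i$ at vendor $v$, $y_v$ whether we order from $v$ at all, $x_{iv}$ whether card $i$ comes from $v$):

$$
\begin{aligned}
\min\quad & \sum_v f_v\, y_v + \sum_i \sum_v c_{iv}\, x_{iv} \\
\text{s.t.}\quad & \sum_v x_{iv} = 1 && \forall i \quad \text{(every card is bought exactly once)} \\
& x_{iv} \le y_v && \forall i, v \quad \text{(pay shipping before buying from } v\text{)} \\
& x_{iv},\, y_v \in \{0, 1\}
\end{aligned}
$$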
Interesting question! :)
So if we have n cards and m vendors, the brute-force approach might have to check up to m^n combinations, right? (A bit less, since not every vendor has every card, but I guess that doesn't really matter in the grand scheme of things ;)
Let's for a second assume each vendor has each card and then see later-on how things change if they don't.
1. Find the cheapest one-vendor solution.
2. Order the cards by price; find the most expensive card that's cheaper at another vendor.
3. For all cards from vendor 1, move them to vendor 2 if they're cheaper there.
4. If having added vendor 2 doesn't make the order cheaper, undo and terminate; otherwise repeat from step 2.
So if one vendor doesn't have all cards, you have to start with a multi-vendor situation. For each vendor, you might start by buying all cards that exist there, then apply the algorithm to the remaining cards.
Obviously, you may not be able to exploit all subtleties in the pricing with this method. But if we assume that a large portion of the price differences is made up by individual high-priced cards, I think you can find a reasonable solution this way.
OK, after writing all this I realized the m^n assumption is actually wrong.
Once you have chosen a set of vendors to buy from, you can simply choose the cheapest vendor for each card. This is a great advantage because the individual choices of where to buy each card don't interfere with each other.
What does this mean for our problem? At first glance, it means that the selection of vendors is the hard part (in terms of computational complexity), not the individual allocation of your buying choices. So instead of m^n, you get 2^m possible configurations in the worst case. So what we need is a heuristic for choosing vendors rather than choosing individual cards. Which might make the heuristic from above actually even more justifiable.
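That structure also makes an exact search straightforward when the vendor count is small. A sketch (exponential in the number of vendors, so small m only; prices[card][vendor] and flat shipping are assumed structures):

```python
from itertools import combinations

# Exact search over vendor subsets, exploiting the observation above:
# once a vendor set is fixed, each card just takes its cheapest vendor
# within the set.
def best_cart(cards, prices, shipping):
    vendors = list(shipping)
    best = (float("inf"), None)
    for k in range(1, len(vendors) + 1):
        for subset in combinations(vendors, k):
            if not all(any(v in prices[c] for v in subset) for c in cards):
                continue             # subset can't cover every card
            cost = sum(shipping[v] for v in subset)
            cost += sum(min(prices[c][v] for v in subset if v in prices[c])
                        for c in cards)
            if cost < best[0]:
                best = (cost, subset)
    return best
```

Subsets containing an unused vendor just come out more expensive than their smaller counterparts, so the minimum is still correct.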
I myself have pondered this. Consider the following:
If it takes you a week to figure out, code, and debug an algorithm that only provides a 1% discount, would you do it?
The answer is probably "No" (unless you're spending your entire life savings on cards, in which case you may be crazy). =)... or Amazon.com
Consequently, there is already an easy approximating algorithm:
Wait until you're buying lots of cards (reduce the shipping overhead).
Buy the cards from 3 vendors:
- the two with the cheapest-but-most-diverse inventories
- a third which isn't really cheap but definitely has every card you'd want.
Optimize accordingly (for each card, buy from the cheaper one).
Also consider local vendors you could just walk to, pre-constructed decks, and trading.
Based on firsthand and secondhand experience, I can say you will find that you can get near the median total price with perhaps a few dollars more shipping than you could otherwise, while still paying around the median for each card. You may have to pay a tiny bit more for understocked cards, but those will be few and far between, and the shipping savings will make up for it.
I recall the old programming adage: "Never optimize, until it's absolutely necessary; chances are you won't need to, or would have optimized the wrong thing." (e.g. your time is a resource too, and also has monetary value)
edit: Given that, this is an amazingly cool problem and one should solve it if one has time.
My algorithm goes like this:
For each card, calculate the average available price, i.e. the sum of the prices from each vendor divided by the number of vendors.
Now, for that card, select the vendors that offer it at or below the average price.
Now for each card we will have a list of vendors. Take the intersection; this way we end up with a set of vendors providing the maximum number of cards at or below the average price.
I'm still thinking over the next steps, but I'm putting the rough idea here.
Now we are left with cards that aren't covered (vendors providing us just a single card). For such cards we look into the price lists of the already shortlisted vendors with the most cards, and if the price difference is less than the shipping cost then we add the card to that vendor's list.
I know this will require a lot of optimization, but this is roughly what I have figured out. Hope this helps.
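A rough translation of the shortlisting idea into code (prices[card][vendor] is an assumed layout; the final shipping-comparison step is left out):

```python
# Shortlist, per card, the vendors at or below that card's average
# price, then rank vendors by how many shortlists they appear on.
def shortlist_vendors(prices):
    candidates = {}
    for card, offers in prices.items():
        avg = sum(offers.values()) / len(offers)
        candidates[card] = {v for v, p in offers.items() if p <= avg}
    # Count how many cards each vendor covers at/below average price.
    coverage = {}
    for vendors in candidates.values():
        for v in vendors:
            coverage[v] = coverage.get(v, 0) + 1
    ranked = sorted(coverage, key=coverage.get, reverse=True)
    return candidates, ranked
```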
How about this:
Calculate the average price per ordered card across all vendors.
For each vendor that has at least one of the cards, calculate the total savings for all cards in the order as the difference between each card's price at that vendor and the average price.
Start with the vendor with the highest total savings and select all of those cards from that vendor.
Continue to select vendors with the next highest total savings until you have all of the cards in the order selected. Skip vendors that don't have cards that you still need.
From the selected list of vendors, redistribute the card purchases to the vendors with the best price for that card.
From the remaining list of vendors, if the list is small enough, you could then brute-force any vendors with a low card count to see if you could move their cards to other vendors and eliminate a shipping cost.
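A sketch of those steps (same assumed data layout as elsewhere in this thread, prices[card][vendor]; the final brute-force pass over shipping costs is omitted):

```python
# Savings-based vendor selection, following the steps above.
def savings_heuristic(cards, prices):
    avg = {c: sum(prices[c].values()) / len(prices[c]) for c in cards}
    vendors = {v for c in cards for v in prices[c]}
    # Step 2: total savings per vendor versus the average prices.
    savings = {v: sum(avg[c] - prices[c][v]
                      for c in cards if v in prices[c]) for v in vendors}
    chosen, needed = [], set(cards)
    # Steps 3-4: take vendors by descending savings until all cards
    # are covered, skipping vendors that add nothing we still need.
    for v in sorted(savings, key=savings.get, reverse=True):
        if any(v in prices[c] for c in needed):
            chosen.append(v)
            needed -= {c for c in needed if v in prices[c]}
        if not needed:
            break
    # Step 5: redistribute, buying each card from its cheapest chosen vendor.
    return {c: min((v for v in chosen if v in prices[c]),
                   key=lambda v: prices[c][v]) for c in cards}
```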
I actually wrote this exact thing last year. The first thing I do after loading all the prices is I weed out my card pool:
Each vendor can have multiple versions of each card, as there are reprints. Find the cheapest one.
Eliminate any offer where the card's price is greater than the cheapest card+shipping combo elsewhere. That is, if I can buy the card cheaper as a one-off from another vendor than by adding it to an existing order from your store, I will buy it from the other vendor.
Eliminate any vendor whose offering I can buy cheaper (for every card) from another vendor. Basically, if another vendor out-prices you on every card, and on the total + shipping, then you are gone.
Unfortunately, this still leaves a huge pool.
Then I do some sorting and some brute-force-depth-first summing and some pruning and eventually end up with a result.
Anyway, I tuned it up to the point that I can do 70 cards and, within a minute, get within 5% of the optimal goal. And in an hour, less than 2%. And then, a couple of days later, the actual, final result.
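For anyone wanting to try something in that spirit, a toy version of the depth-first part might look like this (a sketch, not the author's actual code; prices[card][vendor] and flat per-vendor shipping are assumed):

```python
# Depth-first search with pruning: assign cards one at a time to
# vendors, tracking the running cost (card prices plus shipping the
# first time a vendor is used) and cutting off any branch that already
# exceeds the best complete solution found so far.
def dfs_best(cards, prices, shipping):
    best = [float("inf"), None]

    def walk(i, used, cost, picks):
        if cost >= best[0]:
            return                   # prune: can't beat current best
        if i == len(cards):
            best[0], best[1] = cost, dict(picks)
            return
        card = cards[i]
        # Try cheap vendors first so good bounds are found early.
        for v in sorted(prices[card], key=prices[card].get):
            fee = 0 if v in used else shipping[v]
            picks[card] = v
            walk(i + 1, used | {v}, cost + prices[card][v] + fee, picks)
            del picks[card]

    walk(0, frozenset(), 0.0, {})
    return best
```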
I am going to read more about facility planning. Thanks for that tip!
What about using a genetic algorithm? I think I'll try that one myself. You might seed the pool by adding both a chromosome with the lowest prices and another with the lowest shipping costs.
BTW, did you finally implement any of the solutions presented here? Which one? Why?
Cheers!
What's a good algorithm for suggesting things that someone might like based on their previous choices? (e.g. as popularised by Amazon to suggest books, and used in services like iRate Radio or YAPE where you get suggestions by rating items)
Simple and straightforward (order cart):
Keep a list of transactions in terms of what items were ordered together. For instance when someone buys a camcorder on Amazon, they also buy media for recording at the same time.
When deciding what is "suggested" on a given product page, look at all the orders where that product was ordered, count all the other items purchased at the same time, and then display the top 5 items that were most frequently purchased at the same time.
You can expand it from there based not only on orders, but what people searched for in sequence on the website, etc.
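A tiny sketch of that counting step (the orders below are invented examples):

```python
from collections import Counter
from itertools import combinations

# Co-purchase counting: tally, for every pair of items, how often
# they appear in the same order, then suggest the top-5 companions
# of a given product.
orders = [
    {"camcorder", "tape", "battery"},
    {"camcorder", "tape"},
    {"tape", "battery"},
]

together = Counter()
for order in orders:
    for a, b in combinations(sorted(order), 2):
        together[(a, b)] += 1
        together[(b, a)] += 1

def suggest(product, n=5):
    companions = Counter({b: k for (a, b), k in together.items()
                          if a == product})
    return [item for item, _ in companions.most_common(n)]

print(suggest("camcorder"))   # ['tape', 'battery']
```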
In terms of a rating system (i.e., movie ratings):
It becomes more difficult when you throw in ratings. Rather than a discrete basket of items one has purchased, you have a customer history of item ratings.
At that point you're looking at data mining, and the complexity is tremendous.
A simple algorithm, though, isn't far from the above, but it takes a different form. Take the customer's highest-rated items and lowest-rated items, and find other customers with similar highest-rated and lowest-rated lists. You want to match them with others that have similar extreme likes and dislikes; if you focus on likes only, then when you suggest something they hate, you'll have given them a bad experience. In suggestion systems you always want to err on the side of a "lukewarm" experience rather than "hate", because one bad experience will sour them on using the suggestions.
Suggest items from those other customers' highest-rated lists to the customer.
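As a crude sketch of that matching step (all data invented): represent each customer by their sets of loved and hated items, score similarity on both ends, and recommend the best match's favourites:

```python
# Each customer is a (loved, hated) pair of item sets. Similarity
# rewards shared extremes on BOTH ends, per the reasoning above.
def similarity(a, b):
    loved_a, hated_a = a
    loved_b, hated_b = b
    return (len(loved_a & loved_b) + len(hated_a & hated_b)
            - len(loved_a & hated_b) - len(hated_a & loved_b))

def recommend(target, others):
    best = max(others, key=lambda o: similarity(target, o))
    loved, _ = best
    seen = target[0] | target[1]
    return loved - seen              # their favourites we haven't rated

me = ({"Alien", "Heat"}, {"Gigli"})
peers = [({"Alien", "Blade Runner"}, {"Gigli"}),
         ({"Gigli"}, {"Alien"})]
print(recommend(me, peers))          # {'Blade Runner'}
```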
Consider looking at "What is a Good Recommendation Algorithm?" and its discussion on Hacker News.
There isn't a definitive answer and it's highly unlikely there is a standard algorithm for that.
How you do that heavily depends on the kind of data you want to relate and how it is organized. It depends on how you define "related" in the scope of your application.
Often the simplest thought produces good results. In the case of books, if you have a database with several attributes per book entry (say author, date, genre, etc.) you can simply choose to suggest a random set of books from the same author, the same genre, similar titles and others like that.
However, you can always try more complicated stuff. Keep a record of other users that requested this "product" and suggest other "products" those users requested in the past (a product can be anything from a book to a song to anything you can imagine). This is what most major sites with a suggest function do (although they probably take in a lot of information, from product attributes to demographics, to best serve the client).
Or you can even resort to so-called AI; neural networks can be constructed that take in all those attributes of the product and try (based on previous observations) to relate it to others, and update themselves.
A mix of any of those cases might work for you.
I would personally recommend thinking about how you want the algorithm to work and how to suggest related "products". Then, you can explore all the options: from simple to complicated and balance your needs.
Recommended-products algorithms are huge business nowadays. Netflix, for one, is offering $100,000 for only minor increases in the accuracy of their algorithm.
As you have deduced from the answers so far, and indeed as you suggest, this is a large and complex topic. I can't give you an answer, at least nothing that hasn't already been said, but I can point you to a couple of excellent books on the topic:
Programming CI: http://oreilly.com/catalog/9780596529321/ is a fairly gentle introduction with samples in Python.
CI In Action: http://www.manning.com/alag looks a bit more in depth (but I've only just read the first chapter or two) and has examples in Java.
I think doing a Google on Least Mean Square Regression (or something like that) might give you something to chew on.
I think most of the useful advice has already been suggested, but I thought I'd put in how I would go about it; just thinking aloud, since I haven't done anything like this.
First I would find where in the application to sample the data, so if I have a store it will probably be at checkout. Then I would save a relation between each pair of items in the checkout cart.
Now, if a user goes to an item's page, I can count the number of relations from other items and pick, for example, the 5 items with the highest number of relations to the selected item.
I know it's simple, and there are probably better ways.
But I hope it helps.
Market basket analysis is the field of study you're looking for:
Microsoft offers two suitable algorithms with their Analysis server:
- Microsoft Association Algorithm
- Microsoft Decision Trees Algorithm
Check out this msdn article for suggestions on how best to use Analysis Services to solve this problem.
There is a recommendation platform called Certona that you may find useful; it is used by companies such as B&Q and Screwfix. Find more information at www.certona.com/