I am trying to calculate moving averages (simple and exponential) and I have come across the simple_statistics gem, which is perfect for my needs. I am trying to modify the code from this link (How to calculate simple moving average) for my purposes.
GOAL:
I have a JSON like this which lists historical prices for a single stock over a long time period:
[
{
"Low": 8.63,
"Volume": 14211900,
"Date": "2012-10-26",
"High": 8.79,
"Close": 8.65,
"Adj Close": 8.65,
"Open": 8.7
},
To this, I would like to add moving averages for each day (simple and exponential, which the simple_statistics gem seems to do easily) for 20- and 50-day periods (and others as required), so that each day would look something like this:
[
{
"Low": 8.63,
"Volume": 14211900,
"Date": "2012-10-26",
"High": 8.79,
"Close": 8.65,
"Adj Close": 8.65,
"Open": 8.7,
"SMA20":
"SMA50":
},
I would prefer to use the yahoo_finance and simple_statistics gems and then append the output to the original JSON, as I have a feeling that once I gain a better understanding, it will be easier for me to modify.
Right now, I'm still reading up on how I will do this (any help is appreciated). Below is my attempt to calculate a 20-day simple moving average for Microsoft (it doesn't work). This approach (using HistoricalQuotes_days) seems to assume that the start date is today, which won't work for my overall goal.
require 'rubygems'
require 'yahoofinance'
require 'simple_statistics'
averages = {}
dates.each do |dates|
quotes = YahooFinance::get_HistoricalQuotes_days ( 'MSFT' , 20 ) start, finish
closes = quotes.collect { |quote| quote.close }
averages = closes.mean
end
Thank you
UPDATE: I don't actually need to use the YahooFinance gem, as I already have the data in a JSON. What I don't know how to do is pull the values from the JSON array, make the calculations using the simple_statistics gem, and then add the new data to the original JSON.
Using the gem, I see two ways to get your data. Here they are (note they both can take a block):
YahooFinance::get_HistoricalQuotes_days('MSFT', 20)
Which returns an array of YahooFinance::HistoricalQuote objects with the following methods:
[ :recno, :recno=, :symbol, :symbol=, :date, :date=,
:open, :open=, :high, :high=, :low, :low=, :close,
:close=, :adjClose, :adjClose=, :volume, :volume=,
:to_a, :date_to_Date ]
Or:
YahooFinance::get_historical_quotes_days('MSFT', 20)
which returns an array of raw values. From the documentation:
Getting the historical quote data as a raw array.
The elements of the array are:
[0] - Date
[1] - Open
[2] - High
[3] - Low
[4] - Close
[5] - Volume
[6] - Adjusted Close
And to take an average (simple moving average), you can easily do:
ary.reduce(:+) / ary.length
Where ary holds the values to average (they need to be floats, or Ruby will do integer division). To do the exponential moving average, just use the following formula:
(close - previous_ema) * (2.0 / (amount_of_days_ago + 1) ) + previous_ema
Where close is the stock's close, previous_ema is yesterday's ema, and amount_of_days_ago is the range of the average into the past, for instance 20 (days).
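The two formulas above can be sketched in plain Ruby as follows. This is only an illustration: the method names sma and ema_series are made up here, the closes are assumed to be ordered oldest first, and seeding the EMA with the first close is an assumption (another common choice is to seed it with an SMA of the first period).

```ruby
# Simple moving average of the last `period` values (oldest first).
def sma(values, period)
  window = values.last(period)
  window.sum.to_f / window.length   # to_f avoids integer division
end

# Exponential moving average for every day in `closes` (oldest first).
# Assumption: the first EMA value is seeded with the first close.
def ema_series(closes, period)
  k = 2.0 / (period + 1)            # smoothing factor from the formula above
  closes.each_with_object([]) do |close, out|
    previous_ema = out.last || close
    out << (close - previous_ema) * k + previous_ema
  end
end

puts sma([1, 2, 3, 4], 2)           # average of 3 and 4
p ema_series([1.0, 2.0, 3.0], 3)
```

Note the 2.0 in the smoothing factor: with integer operands, 2 / (period + 1) would evaluate to 0 in Ruby and the EMA would never move.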
edit
oh. Yeah parsing json is easy: https://github.com/flori/json
I can't write a whole beginner's Ruby guide, but the basics for what you need are Hash and Array. Look up how to use Ruby hashes and arrays, and that's probably a good 30% of Ruby programming right there.
For example to get the json objects in an array and then get just the closes, you could use Array#map like so:
stocks = JSON.parse( your_json_here )
array = stocks.map{ |hash| hash["Close"] }
# => [8.65, 9.32, etc... ]
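Putting the pieces together, here is a rough sketch of how the parsed hashes could get an extra "SMA20"-style key and be turned back into JSON. The helper name add_sma and the three-day sample data are invented for illustration, and the code assumes the array is ordered oldest day first. Days without enough history get nil.

```ruby
require 'json'

# Adds an "SMA<period>" key to every day's hash and returns the array.
# Assumption: `stocks` is ordered oldest day first.
def add_sma(stocks, period)
  closes = stocks.map { |day| day["Close"] }
  stocks.each_with_index do |day, i|
    if i + 1 < period
      day["SMA#{period}"] = nil             # not enough history yet
    else
      window = closes[(i + 1 - period)..i]  # the last `period` closes up to this day
      day["SMA#{period}"] = window.sum / period.to_f
    end
  end
end

# Made-up sample data in the same shape as the question's JSON.
json = '[{"Date":"2012-10-24","Close":8.0},' \
       '{"Date":"2012-10-25","Close":9.0},' \
       '{"Date":"2012-10-26","Close":10.0}]'

stocks = add_sma(JSON.parse(json), 2)
puts JSON.pretty_generate(stocks)
```

Calling add_sma again with 20 and 50 on the real data would add both keys to each day; JSON.pretty_generate then gives you the extended JSON to write back to a file.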
hope that gets you started n good luck
Q: How do I prevent JSONata from "auto-flattening" arrays in an array constructor?
Given JSON data:
{
"w" : true,
"x":["a", "b"],
"y":[1, 2, 3],
"z": 9
}
the JSONata query seems to select 4 values:
[$.w, $.x, $.y, $.z]
The nested arrays at $.x and $.y are getting flattened/inlined into my outer wrapper, resulting in more than 4 values:
[ true, "a", "b", 1, 2, 3, 9 ]
The results I would like to achieve are
[ true, ["a", "b"], [1, 2, 3], 9 ]
I can achieve this by using
[$.w, [$.x], [$.y], $.z]
But this requires me to know a priori that $.x and $.y are arrays.
I would like to select 4 values and have the resulting array contain exactly 4 values, independent of the types of values that are selected.
There are clearly some things about the interactions between JSONata sequences and arrays that I can't get my head around.
In common with XPath/XQuery sequences, JSONata flattens the results of a path expression into the output array. It is possible to avoid this in your example by using the $each higher-order function to iterate over an object's key/value pairs. The following expression will get what you want without any flattening of results:
$each($, function($v) {
$v
})
This just returns the value for each property in the object.
UPDATE: Extending this answer for your updated question:
I think this is related to a previous github question on how to combine several independent queries into the same question. This uses an object to hold all the queries in a similar manner to the one you arrived at. Perhaps a slightly clearer expression would be this:
{
"1": t,
"2": u.i,
"3": u.j,
"4": u.k,
"5": u.l,
"6": v
} ~> $each(λ($v){$v})
The λ is just a shorthand for function, if you can find it on your keyboard (F12 in the JSONata Exerciser).
I am struggling to rephrase my question in such a way as to describe the difficulties I am having with JSONata's sequence-like treatment of arrays.
I need to run several queries to extract several values from the same JSON tree. I would like to construct one JSONata query expression which extracts n data items (or runs n subqueries) and returns exactly n values in an ordered array.
This example seems to request 6 values, but because of array flattening the result array does not have 6 values.
This example explicitly wraps each query in an array constructor so that the result has 6 values. However, the values which are not arrays are wrapped in an extraneous & undesirable array. In addition one cannot determine what the original type was ...
This example shows the result that I am trying to accomplish ... I asked for 6 things and I got 6 values back. However, I must know the datatypes of the values I am fetching and explicitly wrap the arrays in an array constructor to work-around the sequence flattening.
This example shows what I want. I queried 6 things and got back 6 answers without knowing the datatypes. But I have to introduce an object as a temporary container in order to work around the array flattening behavior.
I have not found any predicates that allow me to test the type of a value in a query ... which might have let me use the ?: operator to dynamically decide whether or not to wrap arrays in an array constructor. e.g. $isArray($.foo) ? [$.foo] : $.foo
Q: Is there an easier way for me to (effectively) submit 6 "path" queries and get back 6 values in an ordered array without knowing the data types of the values I am querying?
Building on the example from Acoleman, here is a way to pass in n "query" strings (that represent paths):
(['t', 'u.i', 'u.j', 'u.k', 'u.l', 'v'] {
$: $eval('$$.' & $)
}).$each(function($o) {$o})
and get back an array of n results with their original data format:
[
12345,
[
"i",
"ii",
"iii"
],
[],
"K",
{
"L": "LL"
},
null
]
It seems that using $each is the only way to avoid any flattening...
Granted, probably not the most efficient of expressions, since each has to be evaluated from a path string starting at the root of the data structure -- but there ya go.
I've been using Reek lately to refactor my code and one of the smells, DuplicateMethodCall, is being called on array and hash lookups, such as array[1] or hash[:key] when called multiple times.
So I was wondering if multiple array or hash lookups are so expensive, that we should be storing them in a variable rather than calling them directly, which is what everyone does from my experience.
I would not hesitate to store the result of repeated object method calls (especially if it's a DB call) in a variable, but doing that for array and hash lookups feels like overkill.
For example, I'll get a warning for this piece of code:
def sort_params
return [] if params[:reference_letter_section].nil?
params[:reference_letter_section].map.with_index(1) do |id, index|
{ id: id, position: index }
end
end
But I feel like storing params[:reference_letter_section] in its own variable is too much.
"So I was wondering if multiple array or hash lookups are so expensive"
Expensive calls are not the only reason for not doing the call multiple times. It also clutters the code without real need. Consider this not-so-contrived example:
Order.new(
name: params[:order][:name],
total: params[:order][:total],
line_items: [
{
product_name: params[:order][:line_item][:product],
price: params[:order][:line_item][:price],
}
]
)
Even though those hash accesses are super-cheap, it still makes sense to extract them, for readability reasons.
order_params = params[:order]
line_item_params = order_params[:line_item]
Order.new(
name: order_params[:name],
total: order_params[:total],
line_items: [
{
product_name: line_item_params[:product],
price: line_item_params[:price],
}
]
)
The duplicate hash lookup represents coupling between those two lines of code. This can increase the time taken to understand the code, and can be a source of friction when changing it. Of course, within a small method like this the cost is relatively low; but if the two lines of code were further apart -- in different classes, for example -- the effects of the coupling would be much more costly.
Here's a version of your method that doesn't have the duplication:
def sort_params
reference_letters = params[:reference_letter_section] || []
reference_letters.map.with_index(1) do |id, index|
{ id: id, position: index }
end
end
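To see that the deduplicated version still covers the missing-key case, here is a small standalone sketch. It passes params in as an argument and uses a plain Hash as a stand-in for the controller's params object, which is an assumption made only so the snippet runs outside Rails.

```ruby
# Standalone variant: `params` is a plain Hash here (hypothetical stand-in).
def sort_params(params)
  reference_letters = params[:reference_letter_section] || []
  reference_letters.map.with_index(1) do |id, index|
    { id: id, position: index }
  end
end

p sort_params({ reference_letter_section: [7, 3] })  # two hashes, positions 1 and 2
p sort_params({})                                    # key missing, so an empty array
```

The || [] replaces the explicit nil guard, so the lookup happens once and the early return is no longer needed.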
I tried looking for some similar cases here but was unsuccessful.
I want the user to first enter an amount of money and then state a number of days: "30", "45" or "60", with no other options allowed. Finally, the program would multiply the amount of money by a fixed number that depends on the number of days the user has chosen. If he chooses 30, "amount of money * 1.0219"; 45: "amount of money * 1.0336"; 60: "amount of money * 1.0467".
So far, this is the code I wrote:
puts "Indicate money to invest"
money_invested = gets.to_i
puts "Indicate time of investment"
time_investment = gets.to_i
investment_calculation = { 30 => 1.0219, 45 => 1.0336, 60 => 1.0467 }
# will be `nil` if not one of the defined ones
I understand that there is something horrible going on within the hash already, so I decided to stop there. I'm not sure if => means what I want it to mean, i.e.: multiply.
You could store the 'fixed values' in a hash and access them depending on the given user-input. E.g.:
# Integer keys, because time_investment comes from gets.to_i
fixed_values = { 30 => 1.0219,
                 45 => 1.0336,
                 60 => 1.0467 }
Then you have to multiply it with the * operator:
investment_calculation = money_invested * fixed_values[time_investment]
Of course, you should check that the number of days the user types in is a key in your hash, in order to avoid errors or misbehaviour.
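One possible way to do that check is Hash#fetch with a block, which runs only when the key is missing. The sketch below is an illustration, not the only approach: the constant RATES and the method investment_return are names invented here, and the keys are integers because gets.to_i returns an Integer.

```ruby
# Multipliers keyed by the number of days (integers, to match gets.to_i).
RATES = { 30 => 1.0219, 45 => 1.0336, 60 => 1.0467 }.freeze

# Returns the invested amount multiplied by the rate for `days`,
# or raises ArgumentError when `days` is not one of the allowed values.
def investment_return(money_invested, days)
  rate = RATES.fetch(days) do
    raise ArgumentError, "days must be one of #{RATES.keys.join(', ')}"
  end
  money_invested * rate
end

puts investment_return(1000, 30)

begin
  investment_return(1000, 100)
rescue ArgumentError => e
  puts "Invalid input: #{e.message}"
end
```

In the interactive script you would call investment_return(money_invested, time_investment) after the two gets lines and rescue the error to ask the user again.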
Even though your question is very basic, it is well written and easy to understand. Good job!
Now follow me: a Hash, also known as a "Map" in some other non-Ruby languages, is a data-structure that allows us to associate keys with values. Keys in hashes are the equivalent of indexes in arrays: the main purpose of arrays is to store and access elements by using indexes, the main purpose of hashes is to store and access elements by keys. Hashes are really fast when compared against other naive implementations of a generic key-value data structure.
In this example, you are associating the key 30 to the value 1.0219. You can evaluate the behaviour of your hash in the console using the irb command:
> investment_calculation = { 30 => 1.0219, 45 => 1.0336, 60 => 1.0467 }
=> {30=>1.0219, 45=>1.0336, 60=>1.0467}
> investment_calculation[30]
=> 1.0219
> investment_calculation[45]
=> 1.0336
> investment_calculation[100]
=> nil
In comparison, you could solve the problem using an array with sparse values, i.e., with many values set to 0/nil, but it's memory-inefficient, as most of the memory is used to store nothing.
As a total beginner of programming, I am trying to filter a JSON file for my master's thesis at university. The file contains approximately 500 hashes of which 115 are the ones I am interested in.
What I want to do:
(1) Filter the file and select the hashes I am interested in
(2) For each selected hash, return only some specific keys
The format of the array with the hashes ("loans") included:
{"header": {
"total":546188,
"page":868,
"date":"2013-04-11T10:21:24Z",
"page_size":500},
"loans": [{
"id":427853,
"name":"Peter Pan",
...,
"status":"expired",
"paid_amount":525,
...,
"activity":"Construction Supplies",
"sector":"Construction" },
... ]
}
Being specific, I would like to have the following:
(1) Filter out the "loans" hashes with "status":"expired"
(2) Return for each such "expired" loan certain keys only: "id", "name", "activity", ...
(3) Eventually, export all that into one file that I can analyse in Excel or with some stats software (SPSS or Stata)
What I have come up with myself so far is this:
require 'rubygems'
require 'json'
toberead = File.read('loans_868.json')
another = JSON.parse(toberead)
read = another.select {|hash| hash['status'] == 'expired'}
puts hash
This is obviously totally incomplete. And I feel totally lost.
Right now, I don't know where and how to continue, despite having googled and read through tons of articles on how to filter JSON...
Is there anyone who can help me with this?
The JSON will be parsed as a hash: 'header' is one key and 'loans' is another.
So after your JSON.parse line, you can do
loans = another['loans']
Now loans is an array of hashes, each hash representing one of your loans.
You can then do
expired_loans = loans.select {|loan| loan['status'] == 'expired'}
puts expired_loans
to get at your desired output.
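For steps (2) and (3) of the question, a minimal sketch could keep only the wanted keys with Hash#values_at and write them out with Ruby's csv library, which Excel, SPSS and Stata can all import. The method name expired_loans_csv and the KEYS constant are invented for this example, and on recent Ruby versions csv ships as a bundled gem, so it may need to be added to a Gemfile.

```ruby
require 'json'
require 'csv'

KEYS = %w[id name activity].freeze  # columns to keep; extend as needed

# Returns CSV text with a header row and one row per expired loan.
def expired_loans_csv(json_text, keys = KEYS)
  loans = JSON.parse(json_text)['loans']
  expired = loans.select { |loan| loan['status'] == 'expired' }
  CSV.generate do |csv|
    csv << keys
    expired.each { |loan| csv << loan.values_at(*keys) }
  end
end

# Made-up sample in the same shape as the question's file.
sample = '{"header":{"total":2},"loans":[' \
         '{"id":1,"name":"Peter Pan","status":"expired","activity":"Construction Supplies"},' \
         '{"id":2,"name":"Wendy","status":"paid","activity":"Retail"}]}'

puts expired_loans_csv(sample)
```

With the real file, something like File.write('expired_loans.csv', expired_loans_csv(File.read('loans_868.json'))) would produce the export; the output file name is just a placeholder.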
Given a json array:
[{ "x":"5", "y":"20" },{ "x":"6", "y":"10" },{ "x":"50", "y":"5" }]
I'd like to find argmax(x), such that I can do puts argmax(arr, :arg => "x").y and get 5. How can I elegantly implement this in Ruby?
Edit: Clarified a bit. The idea is that you can specify the field of an element in a list that you want to maximize and the method will return the maximizing element.
I think you want Enumerable#max_by. To get y like you're saying, it would be:
arr.max_by {|hash| hash['x']}['y']
(Well, actually, you'll want the numbers to be numbers instead of strings, since '50' sorts lower than '6'. But I think you get the idea. You can call to_i or do whatever processing you need in the block to get the "real" value to sort by.)
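As a concrete sketch of that, here is a small argmax helper built on max_by with to_i in the block. The helper name and its arg: keyword follow the call shape from the question but are otherwise made up, and since JSON.parse returns plain hashes the result is read with ["y"] rather than .y.

```ruby
require 'json'

# Returns the element of `list` whose `arg` field is numerically largest.
# Hypothetical helper; assumes the field holds integer-like strings.
def argmax(list, arg:)
  list.max_by { |item| item[arg].to_i }
end

arr = JSON.parse('[{ "x":"5", "y":"20" },{ "x":"6", "y":"10" },{ "x":"50", "y":"5" }]')

puts argmax(arr, arg: "x")["y"]          # prints 5
puts arr.max_by { |h| h["x"] }["y"]      # prints 10, because "6" > "50" as strings
```

The second line shows the string-comparison pitfall mentioned above: without to_i, the element with "x" equal to "6" wins.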