How can I use the functions of the "survey" package in the "expss" package in R

I am trying to use the expss package for survey data analysis, but the standard errors, variances and confidence intervals it reports differ from those of the survey package.
In survey:
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
svyby(~api99, ~stype, dclus1, svymean)
stype api99 se
E E 607.7917 22.81660
H H 595.7143 41.76400
M M 608.6000 32.56064
In expss:
apiclus1 %>% tab_cells(api99) %>%
tab_rows(stype) %>% tab_weight(pw) %>%
tab_stat_fun(w_mean,w_se) %>% tab_pivot()
| | | | | #Total |
| ----- | -- | ----- | ------ | ------ |
| stype | E | api99 | mean | 607.8 |
| | | | se | 1.6 |
| | H | api99 | mean | 595.7 |
| | | | se | 4.7 |
| | M | api99 | mean | 608.6 |
| | | | se | 3.7 |
How can I use the functions of the survey package within expss?

Elixir: assigning a variable inside a for generator (variable scope?)

I'm solving Project Euler problem 3: find the largest prime factor of a number.
The following Elixir code emits warnings, and I think the assignment inside the if block has no effect:
num = 13195
range = num
  |> :math.sqrt
  |> Float.floor
  |> round
for dv <- 2..range do
  if rem(num, dv) == 0 and div(num, dv) != 1 do
    num = div(num, dv)
  end
end
num
|> IO.puts
Warnings are:
$ elixir 3.exs
warning: variable "num" is unused
3.exs:10
warning: the result of the expression is ignored (suppress the warning by assigning the expression to the _ variable)
3.exs:10
13195
$ elixir -v
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Elixir 1.5.3
How can I update (reassign) num?
(The following Python and JavaScript code works for the same problem):
# 3.py
from math import ceil, sqrt
num = 600851475143
for div in range(2, ceil(sqrt(num)) + 1):
    if num%div == 0 and num/div != 1:
        num /= div
assert int(num) == 6857
// 3.js
var num = 600851475143;
var range = Array.from({length: Math.trunc(Math.sqrt(num))}, (x, i) => i + 2);
for (const div of range) {
    if (num%div === 0 && num/div != 1) {
        num /= div;
    }
}
var assert = require('assert');
assert(num === 6857)

You are actually creating a new variable that shadows the one from the outer scope.
You can rewrite it like this:
num = 13195
range =
  num
  |> :math.sqrt()
  |> Float.floor()
  |> round
num =
  2..range
  |> Enum.reduce(num, fn elem, acc ->
    if rem(acc, elem) == 0 and div(acc, elem) != 1 do
      div(acc, elem)
    else
      acc
    end
  end)
IO.puts num
More on shadowing:
+------------------------------------------------------------+
| Top level |
| |
| +------------------------+ +------------------------+ |
| | Module | | Module | |
| | | | | |
| | +--------------------+ | | +--------------------+ | |
| | | Function clause | | | | Function clause | | |
| | | | | | | | | |
| | | +----------------+ | | | | +----------------+ | | |
| | | | Comprehension | | | | | | Comprehension | | | |
| | | +----------------+ | | | | +----------------+ | | |
| | | +----------------+ | | ... | | +----------------+ | | |
| | | | Anon. function | | | | | | Anon. function | | | |
| | | +----------------+ | | | | +----------------+ | | |
| | | +----------------+ | | | | +----------------+ | | |
| | | | Try block | | | | | | Try block | | | |
| | | +----------------+ | | | | +----------------+ | | |
| | +--------------------+ | | +--------------------+ | |
| +------------------------+ +------------------------+ |
| |
+------------------------------------------------------------+
Any variable in a nested scope whose name coincides with a variable from the surrounding scope will shadow that outer variable. In other words, the variable inside the nested scope temporarily hides the variable from the surrounding scope, but does not affect it in any way.
source
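For readers more at home in the Python from the question, the same fold can be sketched with functools.reduce. This is only an illustration of the Enum.reduce rewrite above, not part of the original answer; the helper name strip_factors is made up here. Like the Elixir version, it divides by each candidate at most once, so it is not a general factorization routine.

```python
from functools import reduce
from math import isqrt


def strip_factors(num):
    """Fold over 2..floor(sqrt(num)), carrying the shrinking number as the accumulator."""

    def step(acc, dv):
        # Return a new accumulator value instead of rebinding an outer variable.
        if acc % dv == 0 and acc // dv != 1:
            return acc // dv
        return acc

    return reduce(step, range(2, isqrt(num) + 1), num)


print(strip_factors(13195))  # prints 29
```

The accumulator plays the role that the reassigned num played in the Python and JavaScript loops: each step receives the current value and hands back the next one.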

Special sliding puzzle - Algorithm to find minimum distance

I just stumbled upon a strange (and very annoying) game that I wanted to solve programmatically. It reminds me a bit of a Rubik's cube, but two-dimensional. I'm struggling a bit with how to approach this...
There is a 9x9 square with some circles placed in the inner squares. For instance, one gets the following picture:
A B C D E F G H I
-------------------------------------
9 | | | O | | | O | | | | J
-------------------------------------
8 | | | O | | O | | O | | | K
-------------------------------------
7 | | | | O | | | O | O | | L
-------------------------------------
6 | | | O | | | | O | | | M
-------------------------------------
5 | | | O | | | | | | | N
-------------------------------------
4 | | | | O | | O | O | | | O
-------------------------------------
3 | | | | | O | | O | | | P
-------------------------------------
2 | | | | O | | | | | | Q
-------------------------------------
1 | | | O | | | | | | | R
-------------------------------------
0 Z Y X W V U T S
One can use the numbers and letters around the square to shift entire "rows" or "columns" left/right or up/down. Circles that would leave the game area on the right reappear on the left and vice versa; the same applies to top/bottom.
The goal is to rearrange the circles into a given pattern within a maximum number of moves. For instance, one should rearrange the circles in the above picture to match the picture below in at most 17 moves:
A B C D E F G H I
-------------------------------------
9 | | | | | | | | | | J
-------------------------------------
8 | | | O | O | O | O | O | | | K
-------------------------------------
7 | | | O | | | | O | | | L
-------------------------------------
6 | | | O | | | | O | | | M
-------------------------------------
5 | | | O | | | | O | | | N
-------------------------------------
4 | | | O | | | | O | | | O
-------------------------------------
3 | | | O | O | O | O | O | | | P
-------------------------------------
2 | | | | | | | | | | Q
-------------------------------------
1 | | | | | | | | | | R
-------------------------------------
0 Z Y X W V U T S
I would like to feed the starting and the end position of the circles to a program that delivers the shortest path possible. I'm struggling a bit to find an approach that doesn't just try all possible moves until a given maximum number of moves is reached.
Also it doesn't seem to be that easy to modify the approach that's being used to solve a Rubik's cube for instance...
Well, I thought it was a very interesting problem, and maybe somebody here has an illuminating idea.
UPDATE:
Just trying all the possible moves doesn't really seem realistic after a first try. There are just too many permutations. I think this could be really hard to solve...if possible at all.
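To make the move model concrete, here is a minimal sketch of the board as a set of circle positions, with a plain breadth-first search over row and column shifts. It is a brute-force baseline that is only practical on small boards, not a solution for the full 9x9 case. It assumes that one move shifts a single row or column by exactly one cell with wrap-around; the names shift and shortest_path are hypothetical.

```python
from collections import deque


def shift(state, n, axis, index, delta):
    """Cyclically shift one row (axis=0) or one column (axis=1) by delta cells."""
    moved = set()
    for r, c in state:
        if axis == 0 and r == index:
            c = (c + delta) % n
        elif axis == 1 and c == index:
            r = (r + delta) % n
        moved.add((r, c))
    return frozenset(moved)


def shortest_path(start, goal, n, max_moves):
    """Return the shortest list of (axis, index, delta) moves, or None if none is found."""
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        if len(path) == max_moves:
            continue  # do not expand beyond the move limit
        for axis in (0, 1):
            for index in range(n):
                for delta in (1, -1):
                    nxt = shift(state, n, axis, index, delta)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, path + [(axis, index, delta)]))
    return None


# Two circles on a 3x3 board that must each move one cell sideways.
print(shortest_path({(0, 0), (1, 1)}, {(0, 1), (1, 0)}, n=3, max_moves=4))
```

Because breadth-first search visits states in order of move count, the first path that reaches the goal is a shortest one. The number of reachable states grows very quickly with board size, which matches the observation above that trying every move is unrealistic on 9x9.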

Slow aggregation on big neo4j graph

Configuration:
Windows 8.1
neo4j-enterprise-2.2.0-M03
cache type: hpc
8Gb RAM
6Gb for JVM Heap (wrapper.java.initmemory=6144 wrapper.java.maxmemory=6144)
5Gb out of 6Gb of JVM Heap for mapped memory (dbms.pagecache.memory=5G)
Model:
Model represents how users navigate through website.
27 522 896 nodes (394Mb)
111 294 796 relationships (3609Mb)
33 906 363 properties (1326Mb)
293 (:Page) nodes
27522603 (:PageView) nodes
0 (:User) nodes (not loaded yet)
each (:PageView) node is connected to a (:Page) node
each (:PageView) node is connected to the next (:PageView) node
each (:PageView) node is connected to a (:User) node (not yet)
Query
match (:Page {Name:'#########.aspx'})<-[:At]-(:PageView)-[:Next]->(:PageView)-[:At]->(p:Page)
return p.Name,count(*) as count
order by count desc
limit 10;
Profile info:
+------------------------------------------------+
| p.Name | count |
+------------------------------------------------+
| "#####################.aspx" | 5172680 |
| "###############.aspx" | 3846455 |
| "#########.aspx" | 3579022 |
| "###########.aspx" | 3051043 |
| "#############################.aspx" | 1713004 |
| "############.aspx" | 1373928 |
| "############.aspx" | 1338063 |
| "#####.aspx" | 1285447 |
| "###################.aspx" | 884077 |
| "##############.aspx" | 759665 |
+------------------------------------------------+
10 rows
195363 ms
Compiler CYPHER 2.2
Planner COST
Projection(0)
|
+Top
|
+EagerAggregation
|
+Projection(1)
|
+Filter(0)
|
+Expand(All)(0)
|
+Filter(1)
|
+Expand(All)(1)
|
+Filter(2)
|
+Expand(All)(2)
|
+NodeUniqueIndexSeek
+---------------------+---------------+----------+----------+-------------------------------------------+--------------------------------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+----------+----------+-------------------------------------------+--------------------------------------------------+
| Projection(0) | 881 | 10 | 0 | FRESHID105, FRESHID110, count, p.Name | p.Name, count |
| Top | 881 | 10 | 0 | FRESHID105, FRESHID110 | { AUTOINT1}; |
| EagerAggregation | 881 | 173 | 0 | FRESHID105, FRESHID110 | |
| Projection(1) | 776404 | 35941815 | 71883630 | FRESHID105, p | |
| Filter(0) | 776404 | 35941815 | 35941815 | p | (NOT(anon[38] == anon[78]) AND hasLabel(p:Page)) |
| Expand(All)(0) | 776404 | 35941815 | 49287436 | p | ()-[:At]->(p) |
| Filter(1) | 384001 | 13345621 | 13345621 | | hasLabel(anon[67]:PageView) |
| Expand(All)(1) | 384001 | 13345621 | 19478500 | | ()-[:Next]->() |
| Filter(2) | 189923 | 6132879 | 6132879 | | hasLabel(anon[46]:PageView) |
| Expand(All)(2) | 189923 | 6132879 | 6132880 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+----------+----------+-------------------------------------------+--------------------------------------------------+
Total database accesses: 202202762
Query without unnecessary labels
match (:Page {Name:'Dashboard.aspx'})<-[:At]-()-[:Next]->()-[:At]->(p)
return p.Name,count(*) as count
order by count desc
limit 10;
Profile info:
+------------------------------------------------+
| p.Name | count |
+------------------------------------------------+
| "#####################.aspx" | 5172680 |
| "###############.aspx" | 3846455 |
| "#########.aspx" | 3579022 |
| "###########.aspx" | 3051043 |
| "#############################.aspx" | 1713004 |
| "############.aspx" | 1373928 |
| "############.aspx" | 1338063 |
| "#####.aspx" | 1285447 |
| "###################.aspx" | 884077 |
| "##############.aspx" | 759665 |
+------------------------------------------------+
10 rows
166751 ms
Compiler CYPHER 2.2
Planner COST
Projection(0)
|
+Top
|
+EagerAggregation
|
+Projection(1)
|
+Filter
|
+Expand(All)(0)
|
+Expand(All)(1)
|
+Expand(All)(2)
|
+NodeUniqueIndexSeek
+---------------------+---------------+----------+----------+-----------------------------------------+---------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+----------+----------+-----------------------------------------+---------------------------+
| Projection(0) | 881 | 10 | 0 | FRESHID82, FRESHID87, count, p.Name | p.Name, count |
| Top | 881 | 10 | 0 | FRESHID82, FRESHID87 | { AUTOINT1}; |
| EagerAggregation | 881 | 173 | 0 | FRESHID82, FRESHID87 | |
| Projection(1) | 776388 | 35941815 | 71883630 | FRESHID82, p | |
| Filter | 776388 | 35941815 | 0 | p | NOT(anon[38] == anon[60]) |
| Expand(All)(0) | 776388 | 35941815 | 49287436 | p | ()-[:At]->(p) |
| Expand(All)(1) | 383997 | 13345621 | 19478500 | | ()-[:Next]->() |
| Expand(All)(2) | 189923 | 6132879 | 6132880 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+----------+----------+-----------------------------------------+---------------------------+
Total database accesses: 146782447
Message.log
Question
How can I perform this query much faster? (more RAM, refactor query, distributed cache, use another language/shell/method, ...)
UPD:
Profile info for last query in answer
neo4j-sh (?)$ profile match (:Page {Name:'Dashboard.aspx'})<-[:At]-()-[:Next]->()-[:At]->(p)
with p,count(*) as count
order by count desc
limit 10 return p.Name, count;
+------------------------------------------------+
| p.Name | count |
+------------------------------------------------+
| "OutgoingDocumentsList.aspx" | 5172680 |
| "DocumentPreview.aspx" | 3846455 |
| "Dashboard.aspx" | 3579022 |
| "ActualTasks.aspx" | 3051043 |
| "DocumentFillMissingRequisites.aspx" | 1713004 |
| "EditDocument.aspx" | 1373928 |
| "PaymentsList.aspx" | 1338063 |
| "Login.aspx" | 1285447 |
| "ReportingRequisites.aspx" | 884077 |
| "ContractorInfo.aspx" | 759665 |
+------------------------------------------------+
10 rows
151328 ms
Compiler CYPHER 2.2
Planner COST
Projection
|
+Top
|
+EagerAggregation
|
+Filter
|
+Expand(All)(0)
|
+Expand(All)(1)
|
+Expand(All)(2)
|
+NodeUniqueIndexSeek
+---------------------+---------------+----------+----------+------------------+---------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+----------+----------+------------------+---------------------------+
| Projection | 881 | 10 | 20 | count, p, p.Name | p.Name, count |
| Top | 881 | 10 | 0 | count, p | { AUTOINT1}; count |
| EagerAggregation | 881 | 173 | 0 | count, p | p |
| Filter | 776388 | 35941815 | 0 | p | NOT(anon[38] == anon[60]) |
| Expand(All)(0) | 776388 | 35941815 | 49287436 | p | ()-[:At]->(p) |
| Expand(All)(1) | 383997 | 13345621 | 19478500 | | ()-[:Next]->() |
| Expand(All)(2) | 189923 | 6132879 | 6132880 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+----------+----------+------------------+---------------------------+
Total database accesses: 74898837

As I mentioned before, in your other question, if you can write a Java based server extension you can do it pretty easily.
// At, Next (relationship types) and INCOMING/OUTGOING (directions) are assumed to be defined or statically imported elsewhere
Label Page = DynamicLabel.label("Page");
// initialize counters
Map<Node,AtomicInteger> pageCounts = new HashMap<>(300);
ResourceIterator<Node> allPages = graphDb.findNodes(Page);
while (allPages.hasNext()) pageCounts.put(allPages.next(), new AtomicInteger());
// find start page
Node page = graphDb.findNode(Page,"Name",pageName);
// follow page-view relationships
for (Relationship at : page.getRelationships(At, INCOMING)) {
    // follow singular next relationship
    Relationship next = at.getStartNode().getSingleRelationship(Next,OUTGOING);
    if (next==null) continue;
    // follow singular page-view relationship to end-page
    Node page2 = next.getEndNode().getSingleRelationship(At,OUTGOING).getEndNode();
    // increment counter
    pageCounts.get(page2).incrementAndGet();
}
// sort pages by count descending
List<Map.Entry<Node,AtomicInteger>> pages = new ArrayList<>(pageCounts.entrySet());
Collections.sort(pages,new Comparator<Map.Entry<Node,AtomicInteger>>() {
    public int compare(Map.Entry<Node,AtomicInteger> e1, Map.Entry<Node,AtomicInteger> e2) {
        return Integer.compare(e2.getValue().get(),e1.getValue().get());
    }
});
// return top 10
return pages.subList(0,Math.min(10,pages.size()));
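The counting logic itself does not depend on Neo4j. As a rough illustration, here is the same walk over a toy in-memory model in Python. The dictionaries page_of (page view to page) and next_view (page view to following page view) are hypothetical stand-ins for the At and Next relationships, not a Neo4j API.

```python
from collections import Counter


def top_next_pages(start_page, page_of, next_view, limit=10):
    """Count which pages are visited directly after a view of start_page."""
    counts = Counter()
    for view, page in page_of.items():
        if page != start_page:
            continue
        following = next_view.get(view)
        if following is None:
            continue  # last view of a session, nothing follows
        counts[page_of[following]] += 1
    # most_common sorts by count, descending
    return counts.most_common(limit)


page_of = {1: "A", 2: "B", 3: "A", 4: "C", 5: "A", 6: "B"}
next_view = {1: 2, 3: 4, 5: 6}
print(top_next_pages("A", page_of, next_view))  # [('B', 2), ('C', 1)]
```

Each page view is touched once and only the small per-page counters are sorted, which is the same reason the Java version avoids the large intermediate row set seen in the Cypher profiles.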
For Cypher I would try something like this:
match (:Page {Name:'#########.aspx'})<-[:At]-(pv:PageView)
WITH distinct pv
MATCH (pv)-[:Next]->(pv2:PageView)
with distinct pv2
match (pv2)-[:At]->(p:Page)
return p.Name,count(*) as count
order by count desc
limit 10;
Update
I wrote a test for it and ran it on my bigger linux machine, the results there are much more sensible: between 1.6s in Java and 5s max in Cypher.
Here is the code and the results: https://gist.github.com/jexp/94f75ddb849f8c41c97c
In Cypher:
-------------------
match (:Page {Name:'Page1'})<-[:At]-()-[:Next]->()-[:At]->(p)
return p.Name,count(*) as count
order by count desc
limit 10;
+-------------------+
| p.Name | count |
+-------------------+
| "Page169" | 975 |
| "Page125" | 959 |
| "Page106" | 955 |
| "Page274" | 951 |
| "Page176" | 947 |
| "Page241" | 944 |
| "Page30" | 942 |
| "Page44" | 938 |
| "Page1" | 938 |
| "Page118" | 938 |
+-------------------+
10 rows
in 3212 ms
[Compiler CYPHER 2.2
Planner COST
+---------------------+---------------+--------+--------+--------------------------+---------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+--------+--------+--------------------------+---------------------------+
| Top | 488 | 10 | 0 | FRESHID71, FRESHID76 | { AUTOINT1}; |
| EagerAggregation | 488 | 300 | 0 | FRESHID71, FRESHID76 | |
| Projection | 238460 | 264828 | 529656 | FRESHID71, p | |
| Filter | 238460 | 264828 | 0 | p | NOT(anon[29] == anon[51]) |
| Expand(All)(0) | 238460 | 264828 | 529656 | p | ()-[:At]->(p) |
| Expand(All)(1) | 238460 | 264828 | 778522 | | ()-[:Next]->() |
| Expand(All)(2) | 476922 | 513694 | 513695 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+--------+--------+--------------------------+---------------------------+
Total database accesses: 2351530]
And in Java:
-------------------
Java took 1618 ms
Node[169]=975
Node[125]=959
Node[106]=955
Node[274]=951
Node[176]=947
Node[241]=944
Node[30]=942
Node[1]=938
Node[44]=938
Node[118]=938
Something else you can do to speed up your Cypher query is to aggregate only on the nodes and return the page.Name property only for the final 10 rows, which is much faster.
match (:Page {Name:'Page1'})<-[:At]-()-[:Next]->()-[:At]->(p)
with p,count(*) as count
order by count desc
limit 10 return p.Name, count

Creating a linkages map

I have an interesting programming problem that I need to solve for an iPhone app that I am currently building. The problem is actually a logic problem that does not need to be specific to any particular programming language.
The app needs to produce a linkages map (apologies if this isn't the right terminology but it makes sense to me). You have the following data:
A=C
B=A
C=O
D=F
E=F
F=G
G=D
H=J
I=L
J=N
K=A
L=O
M=C
N=H
O=E
The letters A through O can each be linked to any other letter. The app needs to follow the links to create a map, so starting with A: A links to C, C links to O, O links to E, E links to F, etc.
When complete this map would look like the attached photo.
http://i.stack.imgur.com/TEfAs.jpg
The problem I have is that I need to write code that will output any map using any combination of links. So for example another link list might look like
A=B
B=A
C=A
D=A
E=A
F=A
G=A
H=A
I=A
J=A
K=A
L=A
M=A
N=A
O=A
I can't get my head around the pseudocode / logic for drawing the app. There are always 15 letters A-O and a letter can never be linked to itself so A can never = A.
Can anyone help to come up with the logic for drawing the map?

What you want is to draw a graph. There is no canonical graphical representation of a graph. So if you have no constraints on how the graph should be drawn, you can simply make a row of the letters and then draw arcs between the letters according to your map.
A little like this (ASCII art):
Example
+-----------------------------------------+
+--------------------------------------+ |
+-----------------------------------+ | |
+--------------------------------+ | | |
+-----------------------------+ | | | |
+--------------------------+ | | | | |
+-----------------------+ | | | | | |
+--------------------+ | | | | | | |
+-----------------+ | | | | | | | |
+--------------+ | | | | | | | | |
+-----------+ | | | | | | | | | |
+--------+ | | | | | | | | | | |
+-----+ | | | | | | | | | | | |
| | | | | | | | | | | | | |
A B C D E F G H I J K L M N O
| |
+--+
Example
+-----------------------------+
+-----------------------------+ |
+--+ +-----------------------------------+
| | | | +--------+ | |
A B C D E F G H I J K L M N O
| | | | | | | | | | | |
+-----+ | +--+ | +-----+ | +--------+
| +-----+ | | +-----------+
| | +--+ +-----------------+
| +--------+ |
+-----------------------------+
It looks a bit confusing, but you cannot always avoid crossings. [In this example you could, but I did not try to, because crossings cannot be avoided in the general case.]
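Whatever layout is chosen, the drawing code first needs the links in a form it can walk. A minimal Python sketch, assuming the input arrives as the "A=C" lines from the question (the function names are hypothetical):

```python
DATA = """A=C B=A C=O D=F E=F F=G G=D H=J
I=L J=N K=A L=O M=C N=H O=E"""


def parse_links(text):
    """Turn whitespace-separated 'X=Y' pairs into a dict {source: target}."""
    links = {}
    for pair in text.split():
        source, target = pair.split("=")
        links[source] = target
    return links


def follow(links, start):
    """Follow links from start until a letter repeats.

    Returns the chain of distinct letters and the letter at which it loops back.
    """
    chain = []
    node = start
    while node is not None and node not in chain:
        chain.append(node)
        node = links.get(node)
    return chain, node


links = parse_links(DATA)
print(follow(links, "A"))  # (['A', 'C', 'O', 'E', 'F', 'G', 'D'], 'F')
```

Since every letter has exactly one outgoing link, each chain must eventually run into a loop. In the first data set the chain from A ends in the loop F, G, D, and each pair (source, target) in links is one arc to draw.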

Need help with credit expiration algorithm

So I'm stuck. I am working on a credit system with expirations. Similar to credit card miles but not exactly. By the way I am sorry for the book ahead but I needed to add enough detail to help get the whole picture.
What I need is a system where a user accumulates credits for doing activities. But they can also spend these credits on activities. The credits should expire after 30 days if they are not used. I seem to be stuck on how to accurately calculate this in a batch that will run every night. Any ideas in any language would be greatly appreciated as I seem to be stuck on just one minor detail that I can't get around. Here is an example of the data:
7/1: +5 - user signs up
7/2: +5 - user interacts with system
7/2: -3 - user purchases activity
7/3: +5 - user interacts with system
So at this point the user has received 15 credits and has spent 3. Leaving him with a total of 12 credits. (At least I got basic math down :P)
I should add that currently we are playing with the idea of having two fields: last processed, next processed. So these values at this time assuming it was a new sign up are:
Last Processed Date: 7/1
Next Process Date: 8/1
So now 8/1 comes around. The batch starts and looks at all credits that are older than 30 days. Which at this point is 5.
This is where it starts to get fuzzy.
Then the system should look at all the credits that have been spent in the last 30 days to see if they are using any credits. Because they should only expire if they haven't been used. So there are 3. So I then deduct the user 2 credits because that is the difference of credits earned older than 30 days and what has been spent. So I finish the batch and set the dates accordingly for the next day. Now assuming they haven't spent anymore I start the calculation over of credits earned older than 30, which is 5 and credits spent which again is 3. But I obviously don't want to consider the 3 credits that I considered yesterday. What is a good approach to not include those 3 credits again for consideration.
That is where I am stuck.
We are thinking about writing a debit record for the expired credits so we can track them but having a hard time seeing how I can use it in this calculation.
If you read this far thank you. If you even make a somewhat effort in the answer I will at a minimum give you an up vote for effort.
EDIT:
OK, @Greg mentioned something that I forgot to address: the idea of putting a flag on the credits considered. A valid point, but not one that can work, because of the following scenario:
Let's say that on a particular day a user spends 10 credits, but the expired credits that the batch is considering only add up to 5. He should still have 5 more spent credits left over to offset a later expiration, because he spent more than a single expiration's worth. The flag wouldn't work because we would have skipped those 5 extra credits. Hope that makes sense?

For every user of the system, keep an array that stores the number of credits available to the user on each of the next 30 consecutive days.
For example, the data for some user might look like this:
8 |
7 | |
6 | | | |
5 | | | | | | | | | | |
4 | | | | | | | | | | | | | | | | |
3 | | | | | | | | | | | | | | | | | | | | | | | |
2 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
1 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
-------------------------------------------------------------
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
^ ^ ^
| \_ |
today tomorrow in 15 days
Every time the user earns some credits, you increase the amounts for all days by the number of credits earned. For example, if the user earns 2 credits, the table changes as follows. It's like raising the whole graph up.
10 |
9 | |
8 | | | |
7 | | | | | | | | | | |
6 | | | | | | | | | | | | | | | | |
5 | | | | | | | | | | | | | | | | | | | | | | | |
4 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
3 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
2 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
1 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
-------------------------------------------------------------
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
^ ^ ^
| \_ |
today tomorrow in 15 days
If the user has x credits today and spends y credits, you decrease the amount available to him to x - y for every day on which he has more than x - y. For days on which he has no more than x - y, the amount stays the same. It's like cutting the top of the graph off. For example, if the user spends 3 credits, the graph changes to:
7 | | | | | | | | | | |
6 | | | | | | | | | | | | | | | | |
5 | | | | | | | | | | | | | | | | | | | | | | | |
4 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
3 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
2 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
1 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
-------------------------------------------------------------
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
^ ^ ^
| \_ |
today tomorrow in 15 days
Every day you shift the graph to the left to model expiring credits. The user will have the following amounts tomorrow:
7 | | | | | | | | | |
6 | | | | | | | | | | | | | | | |
5 | | | | | | | | | | | | | | | | | | | | | | |
4 | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
3 | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
2 | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
1 | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
-------------------------------------------------------------
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
^ ^ ^
| \_ |
today tomorrow in 15 days
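The three operations described above (raise the graph, cut off the top, shift left) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a finished design: the class name CreditWindow is made up, and the day that becomes visible at the far end after a shift is assumed to start at zero.

```python
class CreditWindow:
    """Credits available on each of the next `days` days; index 0 is today."""

    def __init__(self, days=30):
        self.available = [0] * days

    def earn(self, amount):
        # Raise the whole graph: new credits count on every day of the window.
        self.available = [a + amount for a in self.available]

    def spend(self, amount):
        # Cut the top of the graph off at today's balance minus the spend.
        cap = self.available[0] - amount
        if cap < 0:
            raise ValueError("not enough credits")
        self.available = [min(a, cap) for a in self.available]

    def next_day(self):
        # Shift left; assumption: the newly visible last day starts at zero.
        self.available = self.available[1:] + [0]

    def balance(self):
        return self.available[0]


# The example from the question: +5 on 7/1, +5 and -3 on 7/2, +5 on 7/3.
w = CreditWindow()
w.earn(5)
w.next_day(); w.earn(5); w.spend(3)
w.next_day(); w.earn(5)
print(w.balance())  # 12
```

Spending automatically uses up the credits that would expire soonest, because the cap only lowers the days whose amount is still above the new balance.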

I wouldn't consider trying to process the data as you present it. Instead, you should keep track of how many credits the user has, and when they expire. That way you keep track of which credits were used when the purchase is made, instead of trying to work it all out later.
So when the user signs up, they have:
5 credits expiring on 8/1
After interacting with the system the next day:
5 credits expiring on 8/1
5 credits expiring on 8/2
After purchasing something:
2 credits expiring on 8/1
5 credits expiring on 8/2
And so on.
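A minimal Python sketch of this bookkeeping, assuming each batch is stored as an (expiry date, credits) pair and that purchases consume the soonest-expiring batch first, as in the example above. The function names and the year are hypothetical.

```python
from datetime import date


def spend(batches, amount):
    """Consume `amount` credits from (expiry, credits) batches, soonest-expiring first."""
    remaining = amount
    result = []
    for expiry, credits in sorted(batches):
        used = min(credits, remaining)
        remaining -= used
        if credits - used > 0:
            result.append((expiry, credits - used))
    if remaining > 0:
        raise ValueError("not enough credits")
    return result


def drop_expired(batches, today):
    # Assumption: a batch is no longer usable once its expiry date is reached.
    return [(expiry, credits) for expiry, credits in batches if expiry > today]


batches = [(date(2010, 8, 1), 5), (date(2010, 8, 2), 5)]
batches = spend(batches, 3)
print(batches)  # 2 credits expiring on 8/1, 5 credits expiring on 8/2
```

With this layout the nightly batch only has to call drop_expired; no spent credits need to be matched against expirations after the fact.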

Assuming you run this batch on a daily basis, you can have a table that keeps track of all the credits they earned, and the credits they used (negative credits).
At the beginning of the next month, your job is simply to find out which of the credits earned on the first day were not spent during the month.
Take the number of credits earned on the first day minus the credits they spent during all of last month. If the number is positive, they have some credits that need to expire, so simply add a record to the table with a negative credit. This will zero out the unused credits.
The next day, repeat the process: take the credits they earned on the second day minus the sum of all the credits they spent in the last month, taking into account the record with the negative credits you created the previous day.

How about adding a flag to the expenditures? If the flag is not set, then you can include that expenditure in the batch, if necessary. If you do use the expenditure to offset an expiration, then you set the flag. Next time through, you'll ignore that expenditure because the flag is set.

Use a debit record to record normal expenditures. When the monthly batch job runs, it can calculate the total debits which are less than or equal to the expiring credits. If there are credits to expire, simply insert an appropriate debit record (appropriate == to cancel the excess, in your application). In this way, any 'running total' code which examines only credits and debits will reach the same balance that your batch code intended.

One approach to this problem is to store only the transactions, not the balance. Then you always calculate the balance in real time when needed. Here's the data:
Date : Amount : Expiries
7/1 : +5 : 7/31
7/2 : +5 : 8/1
7/2 : -3 : never
7/3 : +5 : 8/2
The balance at any time is simply the total of all transactions that have not yet expired. No need to run any batch processes.

Regarding Julian's reply (which I can't comment on yet): I'm dealing with exactly the same problem, and Julian's approach won't work because it would allow the account to go negative.
If the user didn't use the service for one month, on 8/4 the account balance would be -3, and one activity worth 5 would bring the balance to 2, not to 5 as it should.
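The negative balance can be reproduced with a short Python sketch over the transaction table from the answer being discussed. The year is an arbitrary assumption, and a credit is assumed to count up to and including its expiry date.

```python
from datetime import date

# (date, amount, expiry) rows; None means the row never expires.
transactions = [
    (date(2010, 7, 1), 5, date(2010, 7, 31)),
    (date(2010, 7, 2), 5, date(2010, 8, 1)),
    (date(2010, 7, 2), -3, None),
    (date(2010, 7, 3), 5, date(2010, 8, 2)),
]


def balance(transactions, today):
    """Sum every transaction made by `today` that has not yet expired."""
    return sum(
        amount
        for made, amount, expiry in transactions
        if made <= today and (expiry is None or expiry >= today)
    )


print(balance(transactions, date(2010, 7, 3)))  # 12
print(balance(transactions, date(2010, 8, 4)))  # -3
```

Once all three credits have expired, only the debit of 3 is left in the sum, which is where the -3 comes from.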
