Node comparison using xpath

I have the following XML document (only an excerpt to show the structure, as it is quite long):
<document>
<article>
<head>ISA Savers Could Lose Tax Relief</head>
<text>
<paragraph>Tax relief could be withdrawn from tens of thousands of Individual Savings Account, sheltering up to £500 million from tax, because some banks and building societies are flouting not only the spirit, but the letter of the laws governing them.</paragraph>
<paragraph>This will come as alarming news to thousands of savers who invested in fixed rate Isas which they thought would offer them security combined with a tax shelter The Inland Revenue is writing to all Britain's savings institutions reminding them of the basic Isa rules and warning that tax relief will be withdrawn if these are manipulated in anyway.</paragraph>
<paragraph>Furthermore, the Revenue will castigate some deposit-takers for already breaching the rules and demand they speedily rewrite their terms and conditions, or face loss of tax-relief.</paragraph>
<paragraph>The guardian of the nations coffers is known to be deeply disturbed at the rapid recent growth of fixed- rate Isas which lock savers in for several years, maintaining these are unlawful. It is sending a stark reminder to all savings institutions that customers must be allowed to withdraw their cash, and requiring those which don't, to address the violation.</paragraph>
<paragraph>At the front of the firing line are Bradford &amp; Bingley, Liverpool Victoria, and Julian Hodge Bank, which attracted huge inflows of money because they have been consistently among the best buys. All three specifically prohibit withdrawals. The Ipswich Building Society joined them last week, becoming the latest to offer a fixed- rate Isa, outlawing withdrawals until the end of the term.</paragraph>
<paragraph>A written statement from the Revenue said:"When ISAs were introduced Ministers made it clear that one of the their main objectives was to encourage non-savers to start saving, and people with small amounts saved, to save more. It was therefore important that ISAs would not lock in savers' money as this would exclude people with limited resources who might need access to their money quickly.</paragraph>
<paragraph>"We do not consider that a product whose terms and conditions actually prevented withdrawals or transfers during a fixed term would be consistent with Ministers' intentions or the statutory rules."</paragraph>
<paragraph>So serious are the Revenue's concerns that they no longer trust banks and building societies to monitor themselves. They plan to introduce a new requirement which insists institutions submit full details of all Isa accounts to them for approval, before they becomes available to the public. Until now, banks merely had to satisify themselves they were not breaking the rules.</paragraph>
<paragraph>Paul James, marketing manager of Julian Hodge Bank, said the fixed accounts were launched in response to customer demand.</paragraph>
<paragraph>He said:"People were asking for a fixed rate, and those who have taken them out over the past couple of years have done well. We understood that Isas had to be transferable on notice, which we took to be the end of the fixed period.</paragraph>
<paragraph>"We were audited by the Revenue last year, and the issue was not raised at that stage, so we thought everything was OK. If we have been successfully audited then the accounts can't be otherwise but OK surely?"</paragraph>
<paragraph>Nigel Snell, a spokesman for Liverpool Victoria said that its range of five Isas prohibiting withdrawals did include a proviso in the small print allowing access to funds "in exceptional circumstances and by permission of the trustees".</paragraph>
<paragraph>He added:"We make it clear to people that their money is locked away, but thought the account was compliant because of this clause in the small print."</paragraph>
<paragraph>An Ipswich spokesman said it was aware there was some ambiguity when its account was launched last week. However, it acknowledged it may have to review its terms and conditions again, after seeing the letter from the Revenue.</paragraph>
</text>
<date>
<day>10</day>
<month>05</month>
<year>2002</year>
</date>
<source>Sexymoney</source>
<portal>Finance</portal>
<ID number="2610981009.98103"/>
</article>
.
.
.
I have to find all nodes in each article that occur after the head node and before the portal node.
I don't know how to iterate through all the child and grandchild nodes of the article node and compare them with the head node and portal node of the same article node.
Here is one of my attempts at the XPath query (which obviously does not work, because I don't know how to iterate through all of the nodes for comparison):
for $x in //article return $x[$x/text >> $x/head and $x/text << $x/portal]//*
Thanks in advance for your answers

I. XPath 2.0 solution:
Use:
/*/article/*[. >> ../head and ../portal >> .]
II. XPath 1.0 solution (it is also an XPath 2.0 solution):
/*/article/head/following-sibling::*[following-sibling::portal]
III. XSLT - based verification:
This XSLT 2.0 transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:sequence select="/*/article/*[. >> ../head and ../portal >> .]"/>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<document>
<article>
<head>ISA Savers Could Lose Tax Relief</head>
<text>
<paragraph>Tax relief could be withdrawn from tens of thousands of Individual Savings Account, sheltering up to £500 million from tax, because some banks and building societies are flouting not only the spirit, but the letter of the laws governing them.</paragraph>
<paragraph>This will come as alarming news to thousands of savers who invested in fixed rate Isas which they thought would offer them security combined with a tax shelter The Inland Revenue is writing to all Britain's savings institutions reminding them of the basic Isa rules and warning that tax relief will be withdrawn if these are manipulated in anyway.</paragraph>
<paragraph>Furthermore, the Revenue will castigate some deposit-takers for already breaching the rules and demand they speedily rewrite their terms and conditions, or face loss of tax-relief.</paragraph>
<paragraph>The guardian of the nations coffers is known to be deeply disturbed at the rapid recent growth of fixed- rate Isas which lock savers in for several years, maintaining these are unlawful. It is sending a stark reminder to all savings institutions that customers must be allowed to withdraw their cash, and requiring those which don't, to address the violation.</paragraph>
<paragraph>At the front of the firing line are Bradford &amp; Bingley, Liverpool Victoria, and Julian Hodge Bank, which attracted huge inflows of money because they have been consistently among the best buys. All three specifically prohibit withdrawals. The Ipswich Building Society joined them last week, becoming the latest to offer a fixed- rate Isa, outlawing withdrawals until the end of the term.</paragraph>
<paragraph>A written statement from the Revenue said:"When ISAs were introduced Ministers made it clear that one of the their main objectives was to encourage non-savers to start saving, and people with small amounts saved, to save more. It was therefore important that ISAs would not lock in savers' money as this would exclude people with limited resources who might need access to their money quickly.</paragraph>
<paragraph>"We do not consider that a product whose terms and conditions actually prevented withdrawals or transfers during a fixed term would be consistent with Ministers' intentions or the statutory rules."</paragraph>
<paragraph>So serious are the Revenue's concerns that they no longer trust banks and building societies to monitor themselves. They plan to introduce a new requirement which insists institutions submit full details of all Isa accounts to them for approval, before they becomes available to the public. Until now, banks merely had to satisify themselves they were not breaking the rules.</paragraph>
<paragraph>Paul James, marketing manager of Julian Hodge Bank, said the fixed accounts were launched in response to customer demand.</paragraph>
<paragraph>He said:"People were asking for a fixed rate, and those who have taken them out over the past couple of years have done well. We understood that Isas had to be transferable on notice, which we took to be the end of the fixed period.</paragraph>
<paragraph>"We were audited by the Revenue last year, and the issue was not raised at that stage, so we thought everything was OK. If we have been successfully audited then the accounts can't be otherwise but OK surely?"</paragraph>
<paragraph>Nigel Snell, a spokesman for Liverpool Victoria said that its range of five Isas prohibiting withdrawals did include a proviso in the small print allowing access to funds "in exceptional circumstances and by permission of the trustees".</paragraph>
<paragraph>He added:"We make it clear to people that their money is locked away, but thought the account was compliant because of this clause in the small print."</paragraph>
<paragraph>An Ipswich spokesman said it was aware there was some ambiguity when its account was launched last week. However, it acknowledged it may have to review its terms and conditions again, after seeing the letter from the Revenue.</paragraph>
</text>
<date>
<day>10</day>
<month>05</month>
<year>2002</year>
</date>
<source>Sexymoney</source>
<portal>Finance</portal>
<ID number="2610981009.98103"/>
</article>
</document>
evaluates the XPath 2.0 expression and copies to the output all nodes selected:
<text>
<paragraph>Tax relief could be withdrawn from tens of thousands of Individual Savings Account, sheltering up to £500 million from tax, because some banks and building societies are flouting not only the spirit, but the letter of the laws governing them.</paragraph>
<paragraph>This will come as alarming news to thousands of savers who invested in fixed rate Isas which they thought would offer them security combined with a tax shelter The Inland Revenue is writing to all Britain's savings institutions reminding them of the basic Isa rules and warning that tax relief will be withdrawn if these are manipulated in anyway.</paragraph>
<paragraph>Furthermore, the Revenue will castigate some deposit-takers for already breaching the rules and demand they speedily rewrite their terms and conditions, or face loss of tax-relief.</paragraph>
<paragraph>The guardian of the nations coffers is known to be deeply disturbed at the rapid recent growth of fixed- rate Isas which lock savers in for several years, maintaining these are unlawful. It is sending a stark reminder to all savings institutions that customers must be allowed to withdraw their cash, and requiring those which don't, to address the violation.</paragraph>
<paragraph>At the front of the firing line are Bradford &amp; Bingley, Liverpool Victoria, and Julian Hodge Bank, which attracted huge inflows of money because they have been consistently among the best buys. All three specifically prohibit withdrawals. The Ipswich Building Society joined them last week, becoming the latest to offer a fixed- rate Isa, outlawing withdrawals until the end of the term.</paragraph>
<paragraph>A written statement from the Revenue said:"When ISAs were introduced Ministers made it clear that one of the their main objectives was to encourage non-savers to start saving, and people with small amounts saved, to save more. It was therefore important that ISAs would not lock in savers' money as this would exclude people with limited resources who might need access to their money quickly.</paragraph>
<paragraph>"We do not consider that a product whose terms and conditions actually prevented withdrawals or transfers during a fixed term would be consistent with Ministers' intentions or the statutory rules."</paragraph>
<paragraph>So serious are the Revenue's concerns that they no longer trust banks and building societies to monitor themselves. They plan to introduce a new requirement which insists institutions submit full details of all Isa accounts to them for approval, before they becomes available to the public. Until now, banks merely had to satisify themselves they were not breaking the rules.</paragraph>
<paragraph>Paul James, marketing manager of Julian Hodge Bank, said the fixed accounts were launched in response to customer demand.</paragraph>
<paragraph>He said:"People were asking for a fixed rate, and those who have taken them out over the past couple of years have done well. We understood that Isas had to be transferable on notice, which we took to be the end of the fixed period.</paragraph>
<paragraph>"We were audited by the Revenue last year, and the issue was not raised at that stage, so we thought everything was OK. If we have been successfully audited then the accounts can't be otherwise but OK surely?"</paragraph>
<paragraph>Nigel Snell, a spokesman for Liverpool Victoria said that its range of five Isas prohibiting withdrawals did include a proviso in the small print allowing access to funds "in exceptional circumstances and by permission of the trustees".</paragraph>
<paragraph>He added:"We make it clear to people that their money is locked away, but thought the account was compliant because of this clause in the small print."</paragraph>
<paragraph>An Ipswich spokesman said it was aware there was some ambiguity when its account was launched last week. However, it acknowledged it may have to review its terms and conditions again, after seeing the letter from the Revenue.</paragraph>
</text>
<date>
<day>10</day>
<month>05</month>
<year>2002</year>
</date>
<source>Sexymoney</source>
This XSLT 1.0 transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/article/head/following-sibling::*[following-sibling::portal]"/>
</xsl:template>
</xsl:stylesheet>
evaluates the XPath 1.0 expression on the provided XML document (above) and copies to the output the selected nodes, producing exactly the same result (above).

If there is only one head and one portal node, you can select all elements that come after the head and before the portal, and then take those elements together with their descendants:
//head/following::*[following::portal]/descendant-or-self::*
or if there are multiple articles:
//article/head/following-sibling::*[following-sibling::portal]/descendant-or-self::*
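For readers who want to check the selection without an XPath 2.0 or XSLT processor, here is a minimal Python sketch of the same idea. It uses only the standard library's xml.etree.ElementTree, which does not support the following-sibling axis, so the helper walks the children of each article by hand. The sample document is a shortened stand-in for the one in the question, the function name between_head_and_portal is made up for illustration, and the sketch assumes one head and one portal per article.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the document in the question.
SAMPLE = """<document>
  <article>
    <head>Title</head>
    <text><paragraph>First</paragraph><paragraph>Second</paragraph></text>
    <date><day>10</day></date>
    <source>Example</source>
    <portal>Finance</portal>
    <ID number="1"/>
  </article>
</document>"""


def between_head_and_portal(article):
    """Return the children of `article` that sit after <head> and before <portal>."""
    selected = []
    seen_head = False
    for child in article:
        if child.tag == "portal":
            # Everything collected so far has a following <portal> sibling.
            return selected if seen_head else []
        if seen_head:
            selected.append(child)
        elif child.tag == "head":
            seen_head = True
    return []  # no <portal> found, so nothing qualifies


root = ET.fromstring(SAMPLE)

# Counterpart of head/following-sibling::*[following-sibling::portal]
top_level = [el.tag
             for art in root.iter("article")
             for el in between_head_and_portal(art)]

# Counterpart of appending /descendant-or-self::* to that expression
with_descendants = [d.tag
                    for art in root.iter("article")
                    for el in between_head_and_portal(art)
                    for d in el.iter()]

print(top_level)
print(with_descendants)
```

With a real XPath engine the one-line expressions above remain the simpler choice; the loop is only meant to make the selection logic explicit.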


how to extract <code> content from html using scrapy

I inspect the following content that I want to extract.
<code style="display: none" id="bpr-guid-1441788">
{"companyDetails":{"com.linkedin.voyager.jobs.JobPostingCompany":{"companyResolutionResult":{"entityUrn":"urn:li:fs_normalized_company:166973","name":"World Wildlife Fund","logo":{"image":{"com.linkedin.voyager.common.MediaProcessorImage":{"id":"/p/3/000/093/367/1651958.png"}},"type":"LOGO_LEGACY"}},"company":"urn:li:fs_normalized_company:166973"}},"entityUrn":"urn:li:fs_normalized_jobPosting:324588733","formattedLocation":"Bozeman, Montana","jobState":"LISTED","description":{"attributes":[{"start":572,"length":1,"type":{"com.linkedin.pemberly.text.LineBreak":{}}},{"start":0,"length":574,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":574,"length":1,"type":{"com.linkedin.pemberly.text.LineBreak":{}}},{"start":574,"length":2,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":576,"length":18,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":594,"length":316,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":910,"length":134,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":1044,"length":160,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":1204,"length":342,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":1546,"length":270,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":594,"length":1222,"type":{"com.linkedin.pemberly.text.List":{"ordered":false}}},{"start":1817,"length":1,"type":{"com.linkedin.pemberly.text.LineBreak":{}}},{"start":1834,"length":1,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":1835,"length":147,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":1982,"length":129,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":2111,"length":130,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":2241,"length":92,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":2333,"length":189,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":1835,"length":687,"type":{"com.linkedin.pemberly.text.List":{"ordered":false}}},{"start":2522,"
length":1,"type":{"com.linkedin.pemberly.text.LineBreak":{}}},{"start":2522,"length":1,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2522,"length":2,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":2524,"length":12,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2524,"length":66,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":2590,"length":1,"type":{"com.linkedin.pemberly.text.LineBreak":{}}},{"start":2590,"length":1,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2590,"length":2,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":2592,"length":9,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2592,"length":10,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":2602,"length":17,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2619,"length":12,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2631,"length":78,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2602,"length":108,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":2710,"length":88,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2710,"length":89,"type":{"com.linkedin.pemberly.text.ListItem":{}}},{"start":2602,"length":197,"type":{"com.linkedin.pemberly.text.List":{"ordered":false}}},{"start":2799,"length":177,"type":{"com.linkedin.pemberly.text.Bold":{}}},{"start":2799,"length":177,"type":{"com.linkedin.pemberly.text.Paragraph":{}}},{"start":2976,"length":0,"type":{"com.linkedin.pemberly.text.Paragraph":{}}}],"text":"World Wildlife Fund (WWF), the world’s leading conservation organization, seeks a Data Analyst. Under the direction of the supervisor, this position is responsible for providing data synthesis and analysis for the Northern Great Plains (NGP). 
S/he will assist the NGP program in communicating its goals and successes through the development of data synthesis products and developing statistical interpretation of existing and new datasets, with a focus on informing grassland conservation. S/he will develop products to disseminate information to NGP staff and partners. \n \n Responsibilities Provide data synthesis and interpretation for existing and new datasets to support grassland conservation goals. Work with existing datasets to develop new ways of interpreting the data and communicating it to partners. Work with new datasets to help answer key science questions as outlined by the Program Manager. Given the list of science priorities for the program, develop methods for answering pressing questions using the best available data. Develop spatial data for use in projects. Collect and process datasets for use by NGP staff and partners, as needed and in partnership with the GIS Specialist. Support NGP Program by developing in-depth knowledge of grassland conservation and researching and developing skills in other approaches necessary to ensure success of WWF’s conservation strategies in the region. Build knowledge through research to keep up to date with the state of the art knowledge and apply the knowledge to WWF projects. The candidate will report to the Program Manager. S/he will also maintain strong relationships with the Managing Director, Deputy Director, NGO partner organizations, federal, state and provincial agency planning personnel and corporate and foundations staff at WWF-US. \n Qualifications A Master of Science Degree in Biostatistics, Biology, Conservation Biology, Zoology, Ecology, Wildlife Management, or a related field, is required 4+ years of experience in spatial analysis and data synthesis is required. A PhD will substitute for 3 years of work experience. 
Substantial and demonstrated experience in spatial analysis; data synthesis; and managing an independent work program is required Experience in biodiversity conservation and grassland-focused spatial datasets is preferred Candidates should have a strong commitment to the mission, goals, and values of WWF, good interpersonal and relationship-building skills, energy and enthusiasm, and high ethical standards. \n Please Note: This is a 2-year position based in Bozeman, Montana. \n To Apply: Please visit our Careers Page, job#17065, to submit an online application including resume and cover letter Due to the high volume of applications we are not able to respond to inquiries via phone As an EOE/AA employer, WWF will not discriminate in its employment practices due to an applicant’s race, color, religion, sex, national origin, and veteran or disability status."},"applyMethod":{"com.linkedin.voyager.jobs.OffsiteApply":{"applyStartersPreferenceVoid":true,"companyApplyUrl":"https://careers-wwfus.icims.com/jobs/1727/data-analyst---17065/job"}},"title":"Data Analyst","listedAt":1496950791000}
</code>
I tried several different ways to extract the content, especially the longest text part, such as
body.xpath('//code[@id="bpr-guid-1441788"]/text()').extract()
But nothing comes back: Scrapy returns an empty result.
Can anyone help me out?
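One possible way to get at the long text, sketched under assumptions: in XPath an attribute test is written @id, and the text of the selected element appears to be a JSON document, so once the element is selected its text can be parsed and the description.text field read from it. The sketch below uses the standard library's html.parser in place of Scrapy's selectors purely to stay self-contained; the class name CodeTextExtractor and the shortened HTML snippet are made up for illustration.

```python
import json
from html.parser import HTMLParser


class CodeTextExtractor(HTMLParser):
    """Collect the text inside the <code> element that has a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "code" and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "code":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


# Shortened, hypothetical stand-in for the page content in the question.
html = '''<code style="display: none" id="bpr-guid-1441788">
{"title": "Data Analyst", "description": {"text": "World Wildlife Fund (WWF) seeks a Data Analyst."}}
</code>'''

parser = CodeTextExtractor("bpr-guid-1441788")
parser.feed(html)
parser.close()

payload = json.loads("".join(parser.chunks))  # the element's text is JSON
text = payload["description"]["text"]         # the long description text
print(text)
```

In a Scrapy callback the same two steps would be selecting the element's text and passing it to json.loads; whether the live page delivers the JSON as plain element text is an assumption that needs checking against the actual response.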

How to manage transactions, debt, interest and penalty?

I am making a BI system for a bank-like institution. This system should manage credit contracts, invoices, payments, penalties and interest.
Now, I need to make a method that builds an invoice. I have to calculate how much the customer has to pay right now. He has a debt, which he has to pay off. He also has to pay the interest. If he was ever late with a due payment, penalties are applied for each day he was late.
I thought there were 2 ways of doing this:
1. By having only 1 original state: the contract's original state. Each time the monthly payment the customer has to make is computed, all the payments actually made are taken into account.
2. By constantly making intermediary states, going from the last intermediary state, and considering only the events that took place between the times of these 2 intermediary states. This means having a job that runs periodically (daily, monthly), takes the last saved state, applies the changes (due payments, actual payments, changes in global constants like the penalty rate, which is controlled by the Central Bank), and saves the resulting state.
The benefits of the first variant:
Always up to date. If changes were made with a date from the past (a guy came with a paid invoice 5 days after he made the payment to the bank), they will be correctly reflected in the results.
The flaws of the first variant:
Takes long to compute
Documents printed with the current results may differ if the correct data changes due to operations entered with a back date.
The benefits of the second variant:
Works fast, and aggregated data is always available for search and reports.
Simpler to compute
The flaws of the second variant:
Vulnerable to failed jobs.
Errors in the past propagate until the end, to the final results.
An intermediary result cannot be changed if new data from past transactions arrives (it can, but it's hard, and with many implications, so I'd rather mark it as taboo)
Jobs cannot be performed successfully and without problems if an unfinished transaction exists (an issued invoice that wasn't yet paid)
Is there any other way? Can I combine the benefits from these two? Which one is used in other similar systems you've encountered? Please share any experience.
Problems of this nature are always more complicated than they first appear. This is a consequence of what I like to call the Rumsfeldian problem of the unknown unknown.
Basically, whatever you do now, be prepared to make adjustments for arbitrary future rules. This is a tough proposition. Some future possibilities that may have a significant impact on your calculation model are back-dated payments, adjustments and charges. Forgiven interest periods may also become an issue (particularly if back-dated). You may also face requirements to provide various point-in-time (PIT) calculations based either on what was "known" at that PIT (past view of the past) or on transactions that occurred after the reference PIT but were back-dated to a PIT before it (current view of the past). Calculations of this nature can be a real pain in the head.
My advice would be to calculate from "scratch" (i.e. the first variant). Implement optimizations (e.g. the second variant) only when necessary to meet performance constraints. Doing calculations from the beginning is a compute-intensive model, but it is generally more flexible with respect to accommodating unexpected left turns.
If performance is a problem but the frequency of complicating factors (e.g. back-dated transactions) is relatively low, you could explore a hybrid model employing the best of both variants. Here you store the current state and calculate forward using only those transactions that posted since the last stored state to create a new current state. If you hit a "complication", redo the entire account from the beginning to reestablish the current state.
Being able to accommodate the unexpected without triggering a rewrite is probably more important in the long run than shaving calculation time right now. Do not place restrictions on your computation model until you have to. Saving current state often brings with it a number of built-in assumptions and restrictions that reduce wiggle room for accommodating future requirements.
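The hybrid model described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not a production design: interest is a single flat daily rate (DAILY_RATE is invented), penalties are left out, no transaction is posted before its effective date, and all names (Txn, Snapshot, replay, current_balance) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

DAILY_RATE = 0.001  # hypothetical flat daily interest rate, for illustration only


@dataclass(frozen=True)
class Txn:
    effective: date  # date the transaction applies to
    posted: date     # date it was entered into the system
    amount: float    # positive = charge, negative = payment


@dataclass(frozen=True)
class Snapshot:
    as_of: date      # every transaction posted up to this date is included
    balance: float


def accrue(balance, start, end):
    """Compound daily interest on an outstanding (positive) balance."""
    days = (end - start).days
    return balance * (1 + DAILY_RATE) ** days if balance > 0 else balance


def replay(txns, start, until, balance=0.0):
    """Apply transactions in effective-date order from `start` up to `until`."""
    day = start
    for txn in sorted(txns, key=lambda t: t.effective):
        balance = accrue(balance, day, txn.effective) + txn.amount
        day = txn.effective
    return accrue(balance, day, until)


def is_backdated(txn, snapshot):
    """True if the transaction affects a period the snapshot already covers."""
    return txn.posted > snapshot.as_of and txn.effective <= snapshot.as_of


def current_balance(all_txns, contract_start, snapshot, today):
    """Hybrid: roll forward from the snapshot unless a back-dated entry exists."""
    if any(is_backdated(t, snapshot) for t in all_txns):
        # The stored state is stale: recompute from the original state.
        return replay(all_txns, contract_start, today)
    newer = [t for t in all_txns if t.posted > snapshot.as_of]
    return replay(newer, snapshot.as_of, today, snapshot.balance)


start = date(2024, 1, 1)
history = [Txn(date(2024, 1, 1), date(2024, 1, 1), 1000.0),
           Txn(date(2024, 1, 10), date(2024, 1, 10), -200.0)]
snap = Snapshot(date(2024, 1, 15), replay(history, start, date(2024, 1, 15)))
late = Txn(effective=date(2024, 1, 12), posted=date(2024, 1, 25), amount=-300.0)
today = date(2024, 1, 31)
print(round(current_balance(history + [late], start, snap, today), 2))
```

The full replay stays the single source of truth here, and the snapshot is only a shortcut that gets discarded whenever a back-dated entry shows up. That keeps the fast path optional, in line with the advice not to restrict the computation model before it is necessary.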

Amount to show on a bill form

My company is currently setting up an online billing portal for our customers. I was curious as this question went back and forth a bit between developers and testers: When showing the input form for the amount a customer wishes to pay, do you set the default to be the max amount owed by the customer? Taking a look around at sites when I pay my own bills I tend to see three different setups:
1. Max amount owed is in the input
2. Nothing is put in
3. Button options to pay off max, minimum, or your own input
In general we agree that your max and min amount should be shown on the screen somewhere (it's annoying to go look for your bill when the site can show the amount owed). Is there a standard, or what seems most friendly? Option 1 is nice because it's all there, but it might annoy a customer a bit, or a customer might accidentally pay off a large amount without realizing it (sounds dumb, but you know it'll happen to someone). Option 2 gives the customer a feeling of control over the payment but annoys them by making them enter an amount every time. Option 3 looks to be a middle ground but seems like a bit more unneeded work and upkeep when 1 and 2 are simpler and cleaner to look at.
I'd instinctively go for (1) - default to full amount. However, I grew up in an environment where debt wasn't taken lightly.
You should have a confirmation page with the amount payable anyway, since I might enter a wrong amount and press enter. So the "paying too much" argument doesn't really cut it.
Using the full amount as default can be a slight nudge towards paying all of it. With a major volume of payments, this might be notable.
I would not default to a smaller amount. A customer might overlook that it's not the full amount, consider the deal done and miss the further payments. With a good layout ("Amount Remaining") that can be avoided in almost all cases, but with a large trade volume you might create a few annoyed customers.
Can you query your own payment system to see what kind of payments your customers are making most often? Then set that as your default. I'd give them all options, though, including max, min, and custom.
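The "query your own payment system" idea can be sketched like this. The data layout is an assumption made for illustration: each past payment is a tuple of (amount paid, amount owed, minimum due) in cents, and the function names and the fallback to the full amount are hypothetical choices, not something taken from a real billing system.

```python
from collections import Counter


def classify(paid_cents, owed_cents, minimum_cents):
    """Label a past payment as 'full', 'minimum' or 'custom'."""
    if paid_cents >= owed_cents:
        return "full"
    if paid_cents == minimum_cents:
        return "minimum"
    return "custom"


def suggested_default(history, fallback="full"):
    """Return the most common kind of payment in `history`.

    `history` holds (paid, owed, minimum) tuples in cents; with no
    history at all the (assumed) fallback is the full amount.
    """
    counts = Counter(classify(*row) for row in history)
    return counts.most_common(1)[0][0] if counts else fallback


history = [
    (12000, 12000, 2500),  # paid in full
    (2500, 12000, 2500),   # paid the minimum
    (9000, 9000, 2000),    # paid in full
    (4000, 9000, 2000),    # custom amount
]
print(suggested_default(history))
```

Amounts are kept as integer cents so that the equality checks are exact.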

Scrum in a fixed cost project [closed]

I have read the Agile Manifesto and spent a nice day surfing the web in search of this elusive answer. But sadly I did not get an answer that would cover all the bases.
When you read the blog posts and watch the newscasts of Agile preachers, you only hear about open-scope or open-"time" projects. How do you apply this to a fixed-cost project?
From what I found out, the biggest problem is scope management. How do you determine whether something is outside the projected scope, and how do you formulate arguments for your decision? Because of the agile way you are implementing your software, there is no detailed design to argue from. In most cases you only have a vague wish list that the customer hands to you, and it is so general that you can read any feature into it.
And with the rising percentage of fixed-cost projects, this seems to me to be a real issue.
So the questions would be:
How do you manage scope in a fixed-cost project?
How do you determine whether the features wished for are outside the original scope?
To me, the short answer about Agile and fixed price is that you can't do it, at least not with a fixed scope.
I know some people will say "that's not true, we are doing it" but, with all due respect, I don't think they are really doing Agile, and I'll explain why. Actually the explanation is quite simple: fixed price implies fixed scope and is based on predictability, whereas Agile is all about variable scope, scope management and adaptivity. So fixed price with fixed scope is basically the opposite of Agile.
With an Agile approach, fixed price gives you a number of iterations for a given team size. During these iterations, the customer will be able to have the team build the most valuable features first and thus to maximize the generated business value. The whole idea is then to stop iterating when the cost of an iteration is greater than the generated value. This is how Agile works.
So when people say they do fixed price with fixed scope in an agile way, they actually introduce some constraints that are not really compatible with the Agile theory (like doing an up-front estimation of a given set of features and freezing these features and estimations), and they lose important advantages of Agile (unless they have a perfect knowledge of the technologies and of the business domain and master them enough to predict everything, but I know few projects that are like this).
In any case, here is a good compilation of various Agile contracts that might be helpful: 10 Contracts for your next Agile Software Project. But I think they all require some education of customers, especially the ones that are used to fixed price with fixed scope (and late deliveries).
Scrum does not replace having proper requirements, or even having occasional major releases or milestones. Rather, it gives you a means to keep your team productive and focused, and avoids the time-wasting side-effects of a waterfall process.
In fact, one of the biggest advantages of an agile process like Scrum is that it causes you to "fail quickly and loudly" on problematic areas of your project. If, after a couple of sprints, your team still can't effectively estimate the time and resources needed to implement a particular feature, it may be worth pushing back on the requirements in that area -- they may need to be clarified, simplified, or scrapped altogether. In a traditional waterfall process, however, those "problem features" can often be pushed back to the last possible minute, resulting in the usual deathmarch and under-delivery into which most projects devolve.
However, the role of the Product Owner is even more critical in teams using Scrum who have a large set of requirements. Left to their own devices, most development teams will focus on the most interesting/fun/geeky features (service APIs, caching, search) first, and leave the "messy" stuff like payment process, UX design, and i18n until the last minute. A strong user voice is essential to making sure those features critical to the end user receive their fair share of attention.
Okay, this will not be the ideal answer you are looking for, but it may help nonetheless.
For your first point:
With agile, and Scrum in particular, the style is suited toward changing specifications and unfixed deadlines using iteration patterns. To be able to manage this in a fixed scope project will be a nightmare. What one would normally do is set a budget for the specified scope, and any addendum to this would produce billable hours above and beyond the scoped budget. To do this in Scrum would be pointless, as the product backlog will be continually filled by the stakeholders. If there is no "punishment" for scope changes in a fixed budget, there will be nothing holding people back from just loading on to you.
The alternative here is to have fixed scope sprint successions, so for instance:
5x Sprints = x Cost with minimal scope change.
For your second point:
Analysis and Design is an invaluable tool. By using use cases, event tables, sequence diagrams, state machines and the like, you will be saving yourselves oceans of tears in the long run. Basically, once the planning has been done, any addendum to this that requires additional (please note additional, not things that have been overlooked) use cases and large code changes will be out of scope. In fact, anything that was not overlooked in the planning and is not in your specification is out of scope.
In closing, you will need to have very well planned documentation as well as very solid agreements with your clients to be able to pull this off 100%.
I hope this helps.
I worked in an environment where we had fixed cost and fixed time projects. We had switched to a Scrum-esque methodology from a Waterfall/V-Model methodology. Scrum can work very well in fixed cost/time projects, as the concept is that the customer is put in control; however, for this to work you have to be able to somewhat accurately determine what work is required and what it will cost (time, money, resources). And this is a situation where Scrum is an ideal candidate.
You break down the wishy-washy wish list/requirements/screenshots into tangible deliverables. E.g. a customer may say "I want ecommerce, with Paypal"; you need to break this down into actual deliverables, e.g. "1. Customer Registration and Login, 2. Product Catalogue, 3. Shopping Bag, 4. Payment, 5. Order Acknowledgment". At this stage, it's still impossible to determine how long it will take, and of course we need to deliver all of the above in order to complete the project (i.e. you can't have Ecommerce without Payment). So break them down again, and again, until you have granular deliverables, generally deliverable within hours, maybe days, but certainly not weeks, e.g.
1 Catalogue
1a View all Items
1ai View all items on 1 page with an image and item name underneath in a grid, 4 items per row
1aii View 10 items per page with paging
1aiii View a user-selected number of items per page, with paging
1aiv View all items on 1 page with an image and item name, description and price on the same line, 1 item per row
1b View by Category
...
1c Search
...
1d Attribute Filter
...
And so on. It can be done very quickly, and you can now probably guesstimate how long it would take to do x (of course, I might break the above down even further and add more descriptive text to describe the work required, such as what persistent data structures I might need, the data in those structures, and how data will be added; going further, you might even describe the required begin and exit states).
Once you've got this, you'll notice that some features are dependent on others, e.g. you can't have a paging feature on a catalogue unless you have a catalogue to start with, and the catalogue will require the CMS screens to add and edit items, etc. Highlight these 'can't live without' features in whatever tool you are using and this forms the core project, and within a day or two you have a bunch of features that can be developed somewhat standalone, with costs, which when added up make the cost of the project. And now the customer is in charge: if they decide they want to add a feature and increase the cost, cool, it's up to them after all.
All the above is obviously only a small portion of what scrum or any agile process is.
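As a rough illustration of the breakdown-and-dependency idea in this answer, here is a minimal Python sketch. The feature names, hour estimates and the helper functions `with_dependencies` and `total_hours` are all made up for the example; the point is only that once each granular deliverable has an estimate and a list of things it depends on, the cost of any selection the customer picks falls out by summing the selection plus its dependencies.

```python
# Hypothetical work breakdown: name -> (estimated hours, features it depends on).
# The names and numbers are illustrative only, not taken from a real project.
FEATURES = {
    "cms_item_editor": (16, []),
    "catalogue_view_all": (8, ["cms_item_editor"]),
    "catalogue_paging": (4, ["catalogue_view_all"]),
    "catalogue_search": (12, ["catalogue_view_all"]),
}


def with_dependencies(selected, features):
    """Return the selected features plus everything they transitively depend on."""
    needed = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(features[name][1])
    return needed


def total_hours(selected, features):
    """Sum the estimates for the selection, counting each dependency once."""
    return sum(features[name][0] for name in with_dependencies(selected, features))
```

Asking for paging alone pulls in the catalogue view and the CMS editor as well, which is the "can't live without" core showing up in the estimate.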
I don't think a fixed price contract with scope creep and a Scrum process are incompatible. You just need to agree up front with your customer how it will work. If you create your initial backlog with your customer, estimating as you go, you can use that as your basis for the fixed price cost and schedule. You can even agree to a rate of "X" story points equals "Y" cost and "Z" schedule at the beginning.
You then do the normal scrum thing, having the customer allocate stories to the current iteration, etc.
As the customer engages in scope creep, you work with them to add the "creep" as user stories to the backlog. Each time you add a new story, point out that for each X points added to the backlog, they will have to increase cost by Y and schedule by Z, or, they will have to give up story points of equal value. Since they are picking what you work each iteration, the points they give up (if that's the choice) will be the least valuable features. When your schedule runs out, you will be left with a backlog of the least important features that they can choose to drop or give you a new contract to finish.
The trick, of course, is to be good at estimating cost and schedule for each story/task ;-)
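To make the points-for-cost trade in this answer concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the `Story` class, the convention that a lower `priority` number means more valuable, and the example rate. It shows the two options offered to the customer: pay for the extra points, or keep the budget fixed and let the least valuable stories fall off the end.

```python
from dataclasses import dataclass


@dataclass
class Story:
    name: str
    points: int
    priority: int  # hypothetical convention: lower number = more valuable


def price_increase(extra_points, cost_per_point):
    """Option 1: the customer pays for the added points at the agreed rate."""
    return extra_points * cost_per_point


def add_with_tradeoff(backlog, new_story, budget_points):
    """Option 2: keep the budget fixed.

    Fills the budget in priority order and skips any story that no longer
    fits. Returns (kept, dropped).
    """
    ordered = sorted(backlog + [new_story], key=lambda s: s.priority)
    kept, dropped, used = [], [], 0
    for story in ordered:
        if used + story.points <= budget_points:
            kept.append(story)
            used += story.points
        else:
            dropped.append(story)
    return kept, dropped
```

The dropped list is the backlog of least important features that the customer can abandon or fund with a new contract.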
The project could be broken down into smaller parts and fixed rates could be attached to those. The other phases of the project could then be adjusted.
You have to be able to sell the agile process against your competitors. If a client has a history of fixed bid projects that were delivered on time, spec and cost, why would they waste their time taking bids from other developers?
Fixed Cost does not mean single sprint. Scope gets transferred to the Product Backlog, and as Sprints progress, scope is adjusted, negotiated and delivered. Scrum allows for rapid value delivery, and provides quick validation and the opportunity to identify potential gold plating.
Scope change may result in the addition of backlog items and the deletion of others. It's a balance of ROI vs. the fixed budget provided.
If the scope does increase (and add value), and the cost is fixed, then the triple constraint (cost, time and scope) must be managed accordingly.
Remember that fixed cost does not mean fixed length.

Time Calendar Data Structure

We are looking at updating (rewriting) our system which stores information about when people can reserve rooms etc. during the day. Right now we store the start and end times and the date the room is available in one table, and in another we store the individual appointment times.
On the surface it seemed like a logical idea to store the information this way, but as time progressed and the system came under heavy load, we began to realize that this data structure appears to be inefficient. (It becomes an intensive operation to search all rooms for available times and calculate when the rooms are available. If the room is available for a given time, is the time that it is available long enough to accommodate the requested time).
We have gone around in circles about how to make the system more efficient, and we feel there has to be a better way to approach this. Does anyone have suggestions about how to go about this, or know of any places to look for how to build something like this?
I found this book to be inspiring and a must-read for any kind of database involving time management/constraints:
Developing Time-Oriented Database Applications in SQL
(Added by editor: the book is available online, via Richard Snodgrass's home page. It is a good book.)
@Radu094 has pointed you to a good source of information - but it will be tough going processing that.
At a horribly pragmatic level, have you considered recording appointments and available information in a single table, rather than in two tables? For each day, slice the time up into 'never available' (before the office opens, after the office closes - if such a thing happens), 'available - can be allocated', and 'not available'. These (two or) three classes of bookings would be recorded in contiguous intervals (with start and end time for each interval in a single record).
For each room and each date, it is necessary to create a set of 'not in use' bookings (depending on whether you go with 'never available', the set might be one 'available' record or it might include the early shift and late shift 'never available' records too).
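A minimal sketch of what such a single table might look like, using Python's built-in `sqlite3` module. The table name `room_interval`, the column names, the three status labels and the choice to store times as minutes after midnight are all assumptions made for the example, not something the original system is known to use. `seed_day` creates the initial contiguous intervals for one room on one day.

```python
import sqlite3

# Hypothetical schema: one row per contiguous interval, half-open [start_min, end_min).
SCHEMA = """
CREATE TABLE room_interval (
    room_id   TEXT    NOT NULL,
    day       TEXT    NOT NULL,  -- ISO date such as '2024-01-15'
    start_min INTEGER NOT NULL,  -- minutes after midnight, inclusive
    end_min   INTEGER NOT NULL,  -- minutes after midnight, exclusive
    status    TEXT    NOT NULL CHECK (status IN ('never', 'available', 'booked')),
    booked_by TEXT,
    CHECK (start_min < end_min),
    PRIMARY KEY (room_id, day, start_min)
);
"""


def seed_day(conn, room_id, day, opens, closes):
    """Insert the initial intervals covering the whole day for one room."""
    rows = []
    if opens > 0:
        rows.append((room_id, day, 0, opens, "never"))
    rows.append((room_id, day, opens, closes, "available"))
    if closes < 1440:
        rows.append((room_id, day, closes, 1440, "never"))
    conn.executemany(
        "INSERT INTO room_interval (room_id, day, start_min, end_min, status) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
seed_day(conn, "X", "2024-01-15", opens=9 * 60, closes=17 * 60)
```

A booking would then split an 'available' row into up to three rows (available, booked, available), so the intervals for a room and day always stay contiguous.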
Then you have to work out what questions you are asking. For example:
Can I book Room X on Day Y between T1 and T2?
Is there any room available on Day Y between T1 and T2?
At what times on Day Y is Room X still available?
At what times on Day Y is a room with audio-visual capabilities and capacity for 12 people available?
Who has Room X booked during the morning of Day Y?
This is only a small subset of the possibilities. But with some care and attention to detail, the queries become manageable. Validating the constraints in the DBMS will be harder. That is, ensuring that if the time [T1..T2) is booked, then no-one else books [T1+00:01..T2-00:01) or any other overlapping period. See Allen's Interval Algebra at Wikipedia and other places (including this one at uci.edu).
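The overlap rule for half-open intervals is short enough to sketch in Python. The function names and the representation of bookings as `(start, end)` tuples in minutes after midnight are assumptions for the example; `free_slots` additionally assumes every booking lies within opening hours. Two intervals [a_start, a_end) and [b_start, b_end) overlap exactly when each one starts before the other ends, so back-to-back bookings do not clash.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when half-open intervals [a_start, a_end) and [b_start, b_end) share any instant."""
    return a_start < b_end and b_start < a_end


def can_book(existing, start, end):
    """Check a request against the existing (start, end) bookings for one room and day."""
    if start >= end:
        raise ValueError("start must be before end")
    return not any(overlaps(start, end, s, e) for s, e in existing)


def free_slots(existing, opens, closes):
    """Return the gaps between bookings as (start, end) tuples, in time order."""
    slots = []
    cursor = opens
    for s, e in sorted(existing):
        if s > cursor:
            slots.append((cursor, s))
        cursor = max(cursor, e)
    if cursor < closes:
        slots.append((cursor, closes))
    return slots
```

`can_book` answers the "Can I book Room X on Day Y between T1 and T2?" question, and `free_slots` answers "At what times on Day Y is Room X still available?"; in a real system the same comparison would go into the SQL WHERE clause or a database constraint rather than application code.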

Resources