Ruby screen-scraping library for formatting times

I am wondering if there is a library method out there that will take a time string of unknown format and reformat it into a standard format (i.e. HHMM). Examples of the kind of thing I am getting from websites are:
1030 10:30 10pm 10PM 1030PM 10pm
1030PM 1030p.m. 1030pm. 930 930am
9am 8.30 8.30pm
and I am sure there are others.
I started to write a method and it's getting there (https://gist.github.com/funkytwig/b47551e98e8698ebb59310286982a6ce), but I am wondering if there is already one around. It is worth mentioning that I have come across websites where the times in the same list (e.g. an event listing) are not consistent; I think they are hand-typed into a text field when input.
Just to clarify, I am wondering if there is a method in an existing library; I'm not asking people to debug my code. I'm just sharing it to show what I have done to try to solve the problem, and you will see why I am hoping there is a library.

Try chronic. It can parse a whole lot of time formats, including the ones that you gave.
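If pulling in a gem is not an option, the formats listed in the question can also be handled with a small hand-rolled normalizer. The sketch below is not Chronic and does not use its API; `normalize_time` is a hypothetical helper name, and it assumes that a bare number such as "930" is already on the 24-hour clock, which a scraper cannot really know.

```ruby
# Hypothetical helper (not part of Chronic): normalize a loosely formatted
# time string to "HHMM", or return nil when it cannot be understood.
def normalize_time(text)
  s = text.strip.downcase
  # Detect a trailing am/pm marker such as "am", "pm", "p.m." or "pm.".
  meridiem = s[/([ap])\.?m\.?\z/, 1]
  # Drop the marker, then keep only digits (":" and "." separators vanish).
  digits = s.sub(/\s*[ap]\.?m\.?\z/, "").gsub(/\D/, "")
  return nil unless (1..4).cover?(digits.length)

  # One or two digits are a bare hour ("9", "10"); three or four digits are
  # hour plus minutes ("930", "1030").
  if digits.length <= 2
    hour = digits.to_i
    minute = 0
  else
    hour = digits[0...-2].to_i
    minute = digits[-2..].to_i
  end

  if meridiem
    return nil unless (1..12).cover?(hour)
    hour %= 12                     # 12am becomes 0; 12pm becomes 0 + 12 below
    hour += 12 if meridiem == "p"
  end
  return nil unless (0..23).cover?(hour) && (0..59).cover?(minute)

  format("%02d%02d", hour, minute)
end

%w[1030 10:30 10pm 1030PM 1030p.m. 930am 8.30pm].each do |raw|
  puts "#{raw} -> #{normalize_time(raw)}"
end
```

Returning nil for anything unrecognized makes it easy to log the odd hand-typed entries instead of silently guessing.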

Related

How can I change a file's creation date in Go?

Can I change a file's creation time (apparently also known as birth time or btime) in Go, and if so, how?
I've found a couple of ways to deal with file times in Go, including os.Chtimes and this package — however, the former does not support btime, and the latter seems to deal only with reading the dates, not changing them.
I'm looking for a Windows solution, though a general answer won't hurt either.

ActiveSupport::TimeZone not recognized in Rspec tests

I am using ActiveSupport::TimeZone to set the time zone on a location based on the zip code.
def set_time_zone
  self.time_zone = ActiveSupport::TimeZone.find_by_zipcode(self.zip)
end
This works just fine in the application itself. I am calling the set_time_zone on before_save.
When running the tests with RSpec, the set_time_zone method errors out with "undefined method 'find_by_zipcode' in ActiveSupport::TimeZone".
I have included "require 'active_support/time_with_zone'" in my spec helper as well.
For now my workaround is to skip the before_save callback in the test environment.
Any ideas would be great.
find_by_zipcode is not part of the main ActiveSupport::TimeZone object. The docs for that object are here, and you won't find any mention of zip codes.
A Google search found that method as part of the TZip gem. Since you said it works in your application, I would guess that you have that gem there. You probably just need to add it to your test project. (Sorry, not familiar with Ruby or RSpec all that well, so can't guide you there).
Being quite familiar with time zones, I thought I would also take this opportunity to address a few concerns about the general idea of mapping zip codes to time zones. I'm not so sure that it is a great idea.
It is very U.S. focused. Time Zones are worldwide, and zip codes only work in the USA.
Zip codes change frequently. The USPS publishes databases that you can subscribe to for changes to this data. It would appear from the TZip commit history and issue tracker that they have been manually adding zip code mappings as problems are reported. This is not a good way to handle data that is frequently changing.
A zip code is not the best boundary to identify a location. There are many zip codes that cover disparate, non-contiguous areas. There are also administrative zip codes that don't map to any particular location (like those for overseas military mail).
For those databases that assign a latitude and longitude to a particular zip code, those coordinates are often artificially chosen, as an approximation of the centroid of the area serviced by that zip code. Again, this is not a discrete location.
According to the TZip source code, there are only 7 time zones covered by these mappings. They have forgotten about US territories that also have zip codes, such as Guam. Others, like Puerto Rico, have been erroneously mapped to the Eastern time zone instead of the Atlantic time zone.
So my recommendation would be to avoid this approach entirely. Instead, use one of the methods described in this community wiki.

How would you start automating my job?

At my new job, we sell imported stuff. In order to be able to sell said stuff, currently the following things need to happen for every incoming shipment:
Invoice arrives, in the form of an email attachment, Excel spreadsheet
Monkey opens invoice, copy-pastes the relevant part of three columns into the relevant parts of a spreadsheet template, where extremely complex calculations happen, like =B2*550
Monkey sends this new spreadsheet to boss (email if lucky, printer otherwise), who sets the retail price
Monkey opens the reply, then proceeds to input the data into the production database using a client program that is unusable on so many levels it's not even worth detailing
Monkey fires up HyperTerminal, types in "AT", disconnects
Monkey sends text messages and emails to customers using another part of the horrible client program, one at a time
I want to change Monkey from myself to software wherever possible. I've never written anything that interfaces with email, Excel, databases or SMS before, but I'd be more than happy to learn if it saves me from this.
Here's my uneducated wishlist:
Monkey asks Thunderbird (mail server perhaps?) for the attachment
Monkey tells Excel to dump the spreadsheet into a more Jurily-friendly format, like CSV or something
Monkey parses the output, does the complex calculations
Monkey sends a link to the boss with a web form, where he can set the prices
Monkey connects to the database, inserts data
Monkey spams customers
Is all this feasible? If yes, where do I start reading? How would you improve it? What language/framework do you think would be ideal for this? What would you do about the boss?
There are lots of tools that you could apply here, including Python, Excel macros, VB Script, etc.
In this case, PowerShell seems like an excellent choice, as it naturally combines COM access to Office, .NET, and scripting, and is all-around-awesome. If you already know a suitable technology, you'll get the job done fastest with what you know. Otherwise, PowerShell.
(C# 4.0 is also reasonable, although earlier versions suck when interacting with Office's COM interfaces.)
Don't get carried away trying to solve the whole problem at once. Start by picking a small, easy part that gets you a lot of value right away. You are more likely to succeed this way. (To get your boss to agree, you need success fast. If you aren't telling your boss, you need success even faster!). Once you have that done, you can use your new-found free time (maybe only a few minutes per day) to extend your tools and skills to the next bite-sized piece. Success will accelerate success.
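As an illustration of one such small piece, here is a rough sketch of the "parse the output, do the complex calculations" step, written in Ruby only to have something concrete to look at; the same few lines translate directly to PowerShell or Python. The column layout (item, unit cost, quantity), the `local_costs` name and the sample data are assumptions made up for the example. The 550 multiplier comes from the `=B2*550` formula in the question.

```ruby
# Multiplier taken from the "=B2*550" example in the question.
MULTIPLIER = 550

# Hypothetical invoice layout: item, unit cost, quantity, one item per line.
# Assumes plain comma-separated values without quoted fields; a real export
# is better read with a proper CSV parser.
def local_costs(csv_text)
  rows = csv_text.lines.map(&:strip).reject(&:empty?)
  rows.drop(1).map do |line|             # drop(1) skips the header row
    name, unit_cost, quantity = line.split(",").map(&:strip)
    {
      name: name,
      quantity: Integer(quantity),
      local_cost: (Float(unit_cost) * MULTIPLIER).round(2)
    }
  end
end

invoice = <<~CSV
  item,unit_cost,quantity
  Widget,2.50,10
  Gadget,4,3
CSV

local_costs(invoice).each do |row|
  puts format("%-10s %4d %10.2f", row[:name], row[:quantity], row[:local_cost])
end
```

`Integer()` and `Float()` raise on malformed cells, so a broken invoice fails loudly instead of producing a wrong price.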
In time you will replace monkey with code, and either get a promotion or quit in disgust and get a better job.
The big parts are Excel and email. Excel can be handled with either COM or some sort of interaction with OpenOffice.org. Email, well, there's dozens of ways to do it. My hammer of choice is Python, along with pywin32 or PyUNO, and poplib and smtplib.
Boss... will always be boss. Not always very much you can do about the icky wetware stuff.
I'd start by asking myself the following questions
Does the invoice have to come via email, or can there be a web form where the users can enter the data? There is an easy way to put a form on Google Docs so you can download the responses in Excel format, in a common layout set by you. I'm sure there are better ways too.
Does the boss need to create a new spreadsheet, or can you provide him with a database app where he can view your form, enter the price, check "approved" and have that fire off the process that puts it in the production db?
Can the interface to the client program be worked around? Can you have some other app call the client?
Can the text to the end user be sent by you and not the client app? If so, can you automate that part?
Just some thoughts.
One solution to #1 is to send email to a Unix server (instead of Exchange) and use procmail to dump out attachments (see http://gimpel.ath.cx/howto_fetch_proc_metamail.html for an example of how)
As for boss, have a nice web page which you can email him a link to. And send him a short email (3 lines or less) telling him that using that page will save him 30 mins of work over the course of a month and you 2 hours of work in a month. Just be prepared to back up the #s.
However, at a very high level, unless you're prepared to do the whole automation thing on your own time, you had better be able to convince your boss that the overall time savings over 6 months exceed the time needed to develop this. Maybe monkey salary in his eyes is low enough that the cost of software is just not worth it - and sadly he just may be right, depending on how complicated a bulletproof robust solution is.
As I noted above, your last question is probably the most salient. It is probably best approached as a personal skunkwork project where you show the boss a completed product one day, collect your innovation bonus and then get fired because a stupider monkey can now do your job instead of you.

Why shouldn't I ask my users to enter times using military format

I have a form that asks users to enter a start and end time for an event. For many years, we have allowed them to enter the times by selecting the hour (1-12), minute (1-60), and AM/PM from three drop down boxes. This has worked fine without complaints from customers. However, today I was hit with a request to change the input to one text box for the user to enter time in military time (aka 0000 - 2359). In my gut I believe this is a bad idea but am having trouble coming up with any hard facts.
What are the best reasons I can give that this would be a bad idea?
If there is a better solution for entering time, what would it be?
Also, FYI the users filling out the form run the gamut from very little skill with computers to advanced users. They are in no way military related.
Update: All my users are local and no other forms (web or print) use military time as the standard.
Three dropdowns are a nightmare usability-wise. You can cut these down to two by eliminating AM/PM and moving to 24-hour format, but still: a dropdown with 60 items is overkill.
I'd much prefer to enter time "manually", provided that these input boxes will be intelligent enough (say, they should be able to convert 18 to 1800, 0 to 0000, allow : as a separator, etc.). Plus do not allow users to enter incorrect data in the first place.
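A minimal sketch of what "intelligent enough" could mean for a 24-hour text box, assuming the rules mentioned above (18 becomes 1800, 0 becomes 0000, an optional ":" separator, invalid values rejected). `parse_24h` is a hypothetical name, and a real form would run the same rules client-side as well.

```ruby
# Hypothetical helper: turn lenient 24-hour input into "HHMM",
# or return nil when the input is not a valid time.
def parse_24h(input)
  # Accept "H", "HH", "HMM" or "HHMM", with an optional ":" before the minutes.
  match = input.strip.match(/\A(\d{1,2})(?::?(\d{2}))?\z/)
  return nil unless match

  hour = match[1].to_i
  minute = match[2].to_i   # nil.to_i is 0, so a bare hour means ":00"
  return nil if hour > 23 || minute > 59

  format("%02d%02d", hour, minute)
end

["18", "0", "18:30", "930", "2500", "7:5"].each do |raw|
  puts "#{raw.inspect} -> #{parse_24h(raw).inspect}"
end
```

A nil result is the cue to flag the field instead of storing anything.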
To answer your question: I see no reason to disallow your users to do what they want. After all, they are users.
Well, from a user interface standpoint, this could be a mistake simply according to some of Jakob Nielsen's user interface heuristics:
"Match between system and real world." If your users are not used to entering dates in military time, asking them to do so for your app can be distracting at best, and frustrating at worst.
"Error prevention" You are not eliminating error-prone conditions, but possibly introducing them.
There is also the question of why this change is being made. Are customers complaining? Is data coming in incorrectly? As mentioned by others, are your users used to military time? Any interface change should happen for a reason, IMO, because you're going to change the user experience and there will be ramifications for that; it's just a matter of how large those ramifications will be. My assumption is that data entry errors are supposedly going to be avoided -- but are they? Asking a user to enter a time as "XX:XX" and parsing out the colon (or, as Aaron Digulla stated, ANY non-number characters) and then converting it as needed seems less likely to result in errors than asking a user to enter a time in a format they are not used to using daily.
My concern would be that a user wants to enter 3:30 PM, and, while not paying much attention, simply enters 330. This is now 3:30 AM, and the user will never know the difference, because the app takes the information and happily assumes that this is what is meant. However, allowing the user to enter the time in "XX:XX" format and having an "AM/PM" selection makes much more sense.
As far as hard facts, well, I don't have them either. But if your boss/client won't be swayed by Nielsen's heuristics, I'm not sure what can change their mind.
Oh my.
My advice is to quit and find a different project.
We did a scheduling app for a "military customer" - and even they could not agree on what constituted "military time". Half of them wanted something called "Zulu Time" - the other half wanted "GMT plus offset" - then some wanted local time in 24h format. Contrary to what our contract specified, a Colonel insisted we use "Zulu" - we made the change for political reasons (in violation of our contract) - and then HE missed showing up for a scheduled event, because he thought it was in local time. Then contract management came down on us like a ton of bricks.
(never mind that the published schedule also used an obsolete "offset" that was a cold-war holdover meant to "fool the Russians").
But that is just me sharing a war story...
The real answer is to Elicit Requirements from your customer. Get those requirements SPECIFICALLY written into your contract. Make sure that the stakeholder who is actually writing your check, agrees. Develop to that specification exactly. When someone complains tell them to pay for a contract mod. You'll probably be changing this back and forth among many different settings for the next 10 years. You'll have steady work, and you'll understand why military contracts frequently go way over budget and are never on schedule.
"They are in no way military related."
That's a good enough reason for me. It's an uncommon format that, while not exactly "user-hostile," is nonetheless not the way most of us are used to seeing dates, and requiring your users to do the conversion in their head will lead to arithmetic errors eventually.
That said, drop-down boxes aren't great either. Best to go with 2 input boxes and an AM/PM dropdown, in my opinion.
It may not be a bad idea. Imagine the case where users must enter that bit of information lots of times, for example because they are in call support. Or they may find the dropdown boxes not usable enough, even after having tried them. They may prefer that other format.
It is usually a good idea to talk to the stakeholder and ask him: "Why do you want it this way?" you can then contrast their ideas with yours, but if yours are only that you have the "gut" feeling that this is not right, guess who will win the argument. The gut feeling is not a valid business argument - especially when the business is not yours.
So in short, do what your customer wants - just make sure that they understand their options well, and point out to them any inconvenience that they may not have foreseen - once you find one, that is.
Honestly, I think using AM/PM format is a bad practice, but that may be because I'm used to the 24-hour scale.
One reason against is that if all your users are used to the 12-hour scale, then most of them might still enter 1:00 instead of 13:00 for 1:00 PM. Since the PM is not there, it will result in mistakes.
However, one good reason to do the switch is simply because it's the international standard.
Depending on where you want to put the emphasis (speed or functionality), you can use a time picker that relies on regional settings to display the time in the user's format, or use a clock-like control. If speed is important, you might prefer a simple masked textbox.
Hmmm, describing the 24 hour clock as "military time" and then noting that the users are not military makes me more than a little twitchy.
It will depend on your users but I think that it is more than reasonable to expect people in contemporary society to understand the 24 hour time format and to be able to enter times using that format (given that I would - possibly naively - expect that format to be in use for bus, train, plane and other timetables almost universally for the simple reason that it's unambiguous). Perhaps this is not true worldwide - but it is certainly true across Europe.
That said, changes need to be made for a reason - "if it ain't broke..." is a very sound maxim for a working site and whilst I wouldn't ever willingly use am/pm for time entry I don't have a problem with use of dropdowns for time entry - especially as one can type "into" them. In this case I think that going from drop downs to text boxes is most likely an opportunity to introduce errors (although again it rather depends on the users).
I can see why you think this is a bad idea, silly users input wrong format etc.
However have you considered a jQuery Masked input box?
In my own frames, I accept times and dates in a wide variety of formats. When the field loses focus, I'll try to parse the input and format it into the "correct" or "official" format. This gives the user a nice way to enter the data and a visual cue when something is wrong.
For example, in a date field, I'll accept "1" as "01.12.2009" (current month+year). In a time box, I'll accept "1030", "10 30", "10.30" (i.e. I just filter out anything which isn't a number). "010409 1125" becomes 1. April 2009, 11:25am.
Few people outside the United States know the words "military time". They also prefer the 24-hour format.
If you want globalization, you can do one of the two:
use accepted and de-facto standards, such as ISO8601 date format, 24h time and speak English
dive into the nightmare of region-based localization complexity (some unfortunate programmers have to do it anyway. Then they support AM/PM, Unicode and never-showing-yellow-color for certain cultures)
I cannot believe how much consideration this idea has gotten.
Forcing your user to do things your way, because it's "more efficient" is a terrible idea.
Your forms should be both streamlined (power users can enter data quickly from the keypad) and comprehensible (first-time users can navigate successfully). The conversion to 24-hour time will throw people immediately. I lived in Quebec for almost six years and still had trouble switching back and forth from 24-hour time. DON'T DO THIS.
Just in addition to all the rest of the comments, you should think about one more thing.
Programmers and designers usually think the client pays us just for creating what he tells us to... That's only half true. They pay us, even if they don't realize it, for telling them what they need, what's best for them.
Of course, the final decision is always theirs, as they pay, but if you feel it is wrong and you think you know the business model better than them, then do not blindly accept whatever they told you to do.
You might want to consider using the jQuery timepicker (or Telerik DateTimePicker in Time-only mode for WinForms) and also build in support, on the backend, for multiple formats in the event that JavaScript is disabled.
date/time input through select boxes is a horrible UI design.
but, if some of your users come from the few countries that stick to AM/PM for time format, then forcing the "military" format on them without assistance from the program is also bad.
use something like the jQuery masked input plugin.
if i was doing this, i would use a masked text input and a "PM" checkbox: if the value is more than 1259, the checkbox is disabled. otherwise, it's clear by default.
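a rough sketch of that checkbox rule, written as plain ruby so the logic is easy to follow (in a real page it would live in the form's javascript). the method names are hypothetical, and it assumes an unchecked value is taken exactly as typed and that 12xx with "PM" checked stays 12xx.

```ruby
# Hypothetical model of the form logic; hhmm is an Integer such as 930 or 1545.

# The "PM" checkbox only makes sense for values up to 1259.
def pm_checkbox_enabled?(hhmm)
  hhmm <= 1259
end

# Combine the typed value and the checkbox state into a 24-hour "HHMM" string.
def resolve_time(hhmm, pm_checked)
  hour, minute = hhmm.divmod(100)
  if hhmm.negative? || hour > 23 || minute > 59
    raise ArgumentError, "invalid time: #{hhmm}"
  end

  # Values above 1259 are already unambiguous, so the checkbox is ignored.
  hour += 12 if pm_checkbox_enabled?(hhmm) && pm_checked && hour < 12
  format("%02d%02d", hour, minute)
end

puts resolve_time(330, false)   # typed 330, box clear
puts resolve_time(330, true)    # typed 330, box checked
puts resolve_time(1545, true)   # box would be disabled for this value
```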
Why not use a TimePicker control of some sort?
You shouldn't force non-military users to use a time format that is strange to them.
In any case, assuming that all input is by logged-in users, you can provide multiple mechanisms (and certainly multiple ways of displaying time) and make the choice a user preference. But I'd strongly recommend that whatever you do, for any given user times should be entered and displayed in a consistent manner.

Best GUI control(s) to describe a time range

I need to let end users specify a time range, to be stored and used internally as a starting date/time and ending date/time. The range could be minutes or it could be days.
Has anyone discovered an interactive control that can handle this elegantly?
Most GUI toolkits have a calendar control, so I could specify "start" with a calendar for the day and a text field for the time...and the same for "end".
I could also replace the "end" controls with a single text field or slider that simply describes how many seconds/minutes/hours after start "end" is.
What I don't like about these ideas is how much clicking, typing, and more clicking is required to describe such a simple concept. Also I have to slap the user's hand if a time is typed in that isn't recognizable as a time.
Is there a cleaner implementation that I'm overlooking?
I tend to look at common design patterns for inspiration when I'm pondering problems such as this.
The Yahoo Pattern Library offers some potential solutions.
The UI Patterns site also give some suggestions, and is worth a browse.
For good measure, here's another solution at the Welie pattern library.
Another source of inspiration might be other sites and applications. For example, think of all the use cases where recording short and long time durations is required: company timesheet recording, company car mileage log software, task recording software, stopwatch applications, calendaring apps, etc. Then see how they've handled the GUI controls for capturing time ranges.
I haven't personally found a favourite solution for picking date and time. But, I think I'd want something like this.
User clicks to show calendar popup
Popup shows 2 side-by-side calendars (start date/time and end date/time)
Calendar 1 shows today's date, and the other also shows today's date.
Calendar controls allow usual navigation and selection of day month year.
Below each calendar is a hh:mm box, which defaults to the current time.
User can edit value in this time box using up/down arrows or by typing.
Alternatively, show an analogue clock below each calendar. It takes 2 mouse clicks to set the time (click 1 for the hour and click 2 for the minutes).
Hope this helps
I am a fan of an old control I saw used WAY back in the 90's with Inventor (and later Open Inventor) on SGI machines (and then on PCs, etc): an infinite dial.
Some screenshots, a little on the small side, are here. Of course, it's been done on a variety of platforms since, including similar things on the iPhone.
I think a date/time picker would work well with two dials, each representing an order of date/time magnitude. In ASCII art, with each dial between [square brackets] it might look like:
[20 Oct | 21 OCT | 22 Oct ] [11:15 .. 11:30 .. 11:45..]
or with 3:
[20 Oct | 21 OCT | 22 Oct ] [11 .. 12 .. 1pm] [12:31 .. 12:32 .. 12:33]
There are a number of variations you could try (vertical/horizontal, date/time, date/hour/minute, etc).
Dials, though somewhat rarely used, are a natural device for humans to interact with, and their infinite rotation option (unlike a slide which must always stop) suits dates/times well.
FWIW
User interface design is heavily application dependent. "Best" implies some kind of metric that can measure solutions. In UI design such a metric can be "how many clicks/key-presses does it take to complete the task?" where a smaller number is better. So once you've defined your metric you can start to sort solutions into good, better and best.
You also want to reduce cognitive burden for the user. If the user has to enter the final day on which a product can be exchanged based on a 90-day return policy then asking for start and end date would force them to do date math which is no fun. In this example a start date with a "delta" of x days would place less of a burden on the user.
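The "start plus delta" idea means the program does the date math, not the user. A minimal sketch with Ruby's standard `Date` class, where `last_exchange_day` is a hypothetical name and the 90 days come from the return-policy example above. Whether the purchase day itself counts as day one is a policy question, so treat the exact offset as an assumption.

```ruby
require "date"

RETURN_WINDOW_DAYS = 90  # from the 90-day return policy example

# The user supplies only the start date; the end of the range is derived.
def last_exchange_day(purchase_date, window_days = RETURN_WINDOW_DAYS)
  purchase_date + window_days   # Date#+ adds whole days
end

puts last_exchange_day(Date.new(2009, 10, 21))
```

Storing the start and the delta also keeps the meaning of the range intact when the start date is edited later.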
Depending on your application, you could consider an approach like the Google Finance time range selector on their charts: http://finance.google.com/finance?q=.dji
There is no single answer, it depends on the context. For many places good text controls are enough. Of course such things can still help by supporting pasting and some increase/decrease actions. Maybe it can even do some validation for the value.
Then there are places that need something more. Calendar can be really helpful for entering dates and some kind of slider could be used for time. (Lotus Notes calendar has a slider.)
My advice is:
Think what you need. Don't put complicated widgets to a less used dialog.
If you need these nice helpful widgets, check if there are ready made in the library you are using and take some time to see how others have done these.
Always have the text controls with support for pasting.
Check out the VisualHint date control. It can be configured a multitude of ways including a timespan. This would allow you to use one control instance to show the start time and another to set the timespan until the period is complete. The control also supports an extensible base framework so you could possibly combine both start/end or start/span into a single control.
Here are also some solutions: http://quince.infragistics.com/html/PatternView.aspx?name=Date+Time+Range+Input
Unless there is a more advanced time control in your GUI toolkit of choice, two calendar controls representing start and end is the most straightforward. Also, you need to decide how you want to use the information. For example, if you used a start date and an interval to increment that date, changing the start date wouldn't change the meaning of the interval. It really depends on what you're wanting to do.
One way I've seen work very well is using a gantt chart:
http://en.wikipedia.org/wiki/Gantt_chart
You can create a single line chart and then you can scale it across months, days, hours and minutes depending on how wide or zoomed in you make the control. The problem is I don't know of any control out there right now that does just one line, so you may need to create a custom one. You could possibly look for a gantt chart control and just do one task/item.
Observe what people are doing with your time range control. Then write it so that it's most suited towards doing what the people want to achieve with it. For instance, leave away past dates if inputting future dates only makes sense.
Jonathan Leighton has made a date input element in jQuery that I've found very nice for inputting dates. Its benefit is that the user can either pick the date by clicking or type it in directly. The user also directly gets the hint about typing it into the box. If you couple this with some kind of timeline object, you may actually go far. Just avoid making UI elements that are confusing or angering!
This comes in late, and it's not a control per se. I read this idea on a blog I can't find anymore (in fact, I found this post while trying to find it). The idea is to use the metaphor of a wall clock. Here's what I implemented for the fun of it. It's not a functional control. You could use something like this as a starting point for capturing times naturally. Three clicks at most, two most of the time. Only dials come close.
http://www.viridium.ro/clock-sample/
Use a HTML5-aware browser; that is, Chrome.
