How to persist reference data - windows-phone-7

After the very first start of my application I download all the necessary reference data (a text file in CSV format, about 1 MB in size). The file contains about 30000 lines, and each line is a data entry with name, latitude, longitude and height.
What is the most performant way to save this data? I tried storing the list of entries in IsolatedStorageSettings, but that is by far the worst approach.
Another option is to store the text file in the IsolatedStorageFile directory and, on each launch of the application, load the file and parse it into my list.
The slowest part is reading the file, so I guess a database like SQLite would have the same issue, wouldn't it?
How would you handle this?
Kind regards, Danny

I did something similar in the WherOnEarth application. We store the data in a SQL CE database and then load only the entries that are nearby.
Background reading: http://www.silverlightshow.net/items/Windows-Phone-7.1-Local-SQL-Database.aspx & http://www.jeffblankenburg.com/2011/11/30/31-days-of-mango-day-30-local-database/
I have an .sdf file that I ship with the app, and a data class like the one shown below:
    [Table]
    public class PointData : IPositionElement, INotifyPropertyChanged
    {
        [Column]
        public string Description { get; set; }

        [Column]
        public double Latitude { get; set; }

        [Column]
        public double Longitude { get; set; }

        // ... remaining members (interface implementations) are not shown in the original post
    }
Then, when I read the points that are nearby, I get them with the following:
    (from ht in _context.Points
     where ht.Latitude >= bottomLeft.Latitude && ht.Latitude <= topRight.Latitude &&
           ht.Longitude >= bottomLeft.Longitude && ht.Longitude <= topRight.Longitude
     select ht
    ).ToArray();
This approach was fast enough for me (i.e. it took less time to get the items out of the .sdf file than it did to position them on the screen and do all the other associated maths). Admittedly, I wasn't trying to get 300000 items from the DB. There were more optimizations I could have made in relation to indexing and the like, but as I said, it was quick enough at the time, so I will revisit it at some point later.
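The bounding-box lookup and the indexing idea mentioned above can be sketched with SQLite from Python's standard library. This is only an illustration under assumptions, not the SQL CE / LINQ to SQL code from the answer: the table name `points`, the index name, the helper functions and the sample rows are all made up for the example.

```python
import sqlite3

def build_db(rows):
    """Create an in-memory database with a composite index on the coordinates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE points (description TEXT, latitude REAL, longitude REAL)"
    )
    conn.executemany("INSERT INTO points VALUES (?, ?, ?)", rows)
    # Hypothetical index; it lets the engine narrow the latitude range quickly.
    conn.execute("CREATE INDEX idx_points_lat_lon ON points (latitude, longitude)")
    conn.commit()
    return conn

def points_in_box(conn, bottom_left, top_right):
    """Return descriptions of points inside the box given as (lat, lon) corners."""
    cursor = conn.execute(
        "SELECT description FROM points "
        "WHERE latitude BETWEEN ? AND ? AND longitude BETWEEN ? AND ? "
        "ORDER BY description",
        (bottom_left[0], top_right[0], bottom_left[1], top_right[1]),
    )
    return [row[0] for row in cursor]

# Illustrative sample rows (name, latitude, longitude).
conn = build_db([
    ("Zugspitze", 47.42, 10.98),
    ("Brocken", 51.80, 10.62),
    ("Feldberg", 47.87, 8.00),
])
nearby = points_in_box(conn, (47.0, 7.5), (48.0, 11.5))
```

Only the rows inside the box are materialised, which is the point of the approach: the full set of entries never has to be parsed into memory at launch.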

Related

EF 6.2 code first, simple query takes very long

In an old DB application I'd like to start moving towards code first approach.
There are a lot of SPs, triggers, functions, etc. in the database which make things error prone.
As a starter, I'd like to have a proof of concept, therefore I started with a new solution, where I imported the entire database (Add new item -> ADO.NET entity data model -> Code First from database)
As a simple first shot I wanted to query 1 column of 1 table. The table contains about 5k rows and the result delivers 3k strings. This takes over 90 seconds now!
Here's the code of the query:
    static void Main(string[] args)
    {
        using (var db = new Model1())
        {
            var theList = db.T_MyTable.AsNoTracking()
                .Where(t => t.SOME_UID != null)
                .OrderBy(t => t.SOMENAME)
                .Select(t => t.SOMENAME)
                .ToList();
            foreach (var item in theList)
            {
                Console.WriteLine(item);
            }
            Console.WriteLine("Number of names: " + theList.Count());
        }
        Console.ReadKey();
    }
In the generated table code I added the column type "VARCHAR" to all of the string column properties:
    [Column(TypeName = "VARCHAR")] // I added this to all of the string properties
    [StringLength(50)]
    public string SOME_UID { get; set; }
I assume I am missing an important step; I can't believe a code first query is this slow.

I figured out the root cause: the huge context that needs to be built, consisting of over 1000 tables/files.
How I found the problem: using the profiler I observed that the expected query hits the database after about 90 seconds, which told me that the query itself is fast. Then I tried the same code in a new project, where I only imported the single table I access in the code.
Further evidence that it is context related: when executing the query twice in the same session, the second execution took only milliseconds.
Key point: if you have a legacy database with a lot of tables, don't use one single DbContext that contains all the tables (except for initializing the database); use several smaller, domain-specific ones with just the tables you need for the given domain context. Entities can exist in multiple DbContexts; tailor the relationships (e.g. by "Ignore"-ing those not required) and use lazy loading where appropriate. These things help to boost performance.

storing business hours in Parse DB

I need some help with the infrastructure for storing business hours for a location on Parse.com. I already tried it as a separate class called BusinessHours, where each row has a pointer to the Location class. With a minimum of 7 rows per location (one for each day of the week), the object count comes to more than 10,000.
Then in Swift I do this to determine whether the location is open now:
    for hour in hours {
        if hour.isClosedAllDay {
            isOpen = "closed".localized
        } else {
            let now = NSDate()
            if now.hasDayOffset(hour.weekday, closeWeekDay: hour.nextWeekday) {
                if hour.open != nil && hour.close != nil {
                    let open = now.hourDateFromString(hour.open!, offset: now.dayOpenOffset(hour.weekday, closeWeekDay: hour.nextWeekday))
                    let close = now.hourDateFromString(hour.close!, offset: now.dayCloseOffset(hour.weekday, closeWeekDay: hour.nextWeekday))
                    if now.isBetween(open, close: close) {
                        isOpen = "open".localized
                        timeOfBusiness = hour.time!
                        break
                    }
                }
            }
        }
    }
Is there a better way to do this than to have thousands of rows for business hours alone? I was thinking of adding an object field to the Location class for the hours, but I don't know if that is the right way to go either.

Depending on how you want to edit and change the details, and the complexities of multiple opening times per day, I'd consider not using multiple columns and rows. Instead, you could simply store a JSON string in a single column which contains all of the required details.
Obviously you wouldn't be able to use this for querying, so if you need to do that then you need to keep something more like your current solution.
If you don't need querying, or you need simple querying like 'is it open at all on a Monday' then a combined solution, supported by cloud code so the app doesn't need lots of knowledge of the JSON, could work well. For instance you could have columns for general open hours each day and then details in JSON, so you can get a rough answer by querying and then check the exact detail before presentation / usage of the result.

I ended up doing it like this in an array field called businessHours in my Location class:
[
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":1,"weekday":1},
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":2,"weekday":2},
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":3,"weekday":3},
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":4,"weekday":4},
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":5,"weekday":5},
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":6,"weekday":6},
{"close":"20:00Z","open":"12:00Z","time":"09:00 - 17:00","isClosedAllDay":false,"nextWeekday":7,"weekday":7}
]
and then I loop through the objects, treating each one as an NSDictionary.
Thanks, Wain!
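A rough sketch of how an "is it open now" check against such a businessHours array might look. It is not the poster's Swift code: it assumes the `open` and `close` values are UTC times in `HH:MMZ` form, that the caller passes a weekday number using the same numbering as the stored data, and it ignores intervals that run past midnight (where `nextWeekday` differs from `weekday`). The helper names `parse_utc` and `is_open` are made up, and the second sample entry is marked closed purely for illustration.

```python
import json
from datetime import time

BUSINESS_HOURS_JSON = """
[
  {"close": "20:00Z", "open": "12:00Z", "time": "09:00 - 17:00",
   "isClosedAllDay": false, "nextWeekday": 1, "weekday": 1},
  {"close": "20:00Z", "open": "12:00Z", "time": "09:00 - 17:00",
   "isClosedAllDay": true, "nextWeekday": 2, "weekday": 2}
]
"""

def parse_utc(text):
    """Parse a value such as '12:00Z' into a datetime.time."""
    hours, minutes = text.rstrip("Z").split(":")
    return time(int(hours), int(minutes))

def is_open(business_hours, weekday, now_utc):
    """Return True if any entry for `weekday` covers the UTC time `now_utc`."""
    for entry in business_hours:
        if entry["weekday"] != weekday or entry["isClosedAllDay"]:
            continue
        if parse_utc(entry["open"]) <= now_utc < parse_utc(entry["close"]):
            return True
    return False

hours = json.loads(BUSINESS_HOURS_JSON)
open_day1_afternoon = is_open(hours, 1, time(13, 30))
open_day1_night = is_open(hours, 1, time(21, 0))
open_day2_afternoon = is_open(hours, 2, time(13, 30))
```

Because the whole week lives in one field of the Location object, the check needs no extra query; the trade-off, as noted in the answer above, is that the hours cannot easily be used in server-side queries.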

Apache Mahout Database to Sequence File

I am currently experimenting with Mahout, and I purchased the book Mahout in Action.
I understand the overall process, and I have already succeeded with simple test data sets.
Now I have a classification problem that I would like to solve.
The target variable has been identified; for now I will call it x.
The existing data in our database has already been classified with -1, 0 and +1.
We defined several predictor variables, which we select with an SQL query.
These are the product's attributes: language, country, category (of the shop), title, description.
Now I want to write them directly to a SequenceFile. For this I wrote a small helper class that appends to the sequence file each time a row of the SQL result set has been processed:
    public void appendToFile(String classification, String databaseID, String language, String country, String vertical, String title, String description) {
        int count = 0;
        Text key = new Text();
        Text value = new Text();
        key.set("/" + classification + "/" + databaseID);
        //??value.set(message);
        try {
            this.writer.append(key, value);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
If I only had the title, I could simply store it in the value. But how do I store multiple values such as country, language, and so on, under that particular key?
Thanks for any help!

You shouldn't be storing structures in a sequence file; just dump all the text you have, separated by spaces.
It is simply a place to put all your content for term counting and the like when using something like Naive Bayes, which does not care about structure.
Then, once you have the classification, look up the structure in your database.
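As a small illustration of "dump all the text, separated by spaces", the sketch below builds the key and value strings for one row. It is written in Python rather than against the Hadoop API, so it only shows the string handling; the function name `build_record` and the sample values are hypothetical, and the key format simply mirrors the `"/" + classification + "/" + databaseID` pattern from the question.

```python
def build_record(classification, database_id, *fields):
    """Return the (key, value) pair for one database row."""
    key = "/" + classification + "/" + database_id
    # Skip missing or blank fields, then join the rest with single spaces.
    value = " ".join(f.strip() for f in fields if f and f.strip())
    return key, value

key, value = build_record(
    "1", "4711", "en", "US", "shoes", "Red sneakers", "  Light running shoe "
)
```

The resulting `value` string is what would be passed to `value.set(...)` in the helper from the question before appending to the writer.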

What's the name of a table of values/frequencies?

I have a main data store which has a big set of perfectly ordinary records, which might look like (all examples here are pseudocode):
    class Person {
        string FirstName;
        string LastName;
        int Height;
        // and so on...
    }
I have a supplementary data structure I'm using for answering statistical questions efficiently. It's computed from the main data store, and it's a dictionary that looks like:
// { (field_name, field_value) => count }
Dictionary<Tuple<string, object>, int>;
For example, one entry of the dictionary might be:
(LastName, "Smith") => 345
which means in 345 of the Person records, the LastName field is "Smith" (or was, at the time this dictionary was last computed).
What is this supplementary dictionary called? I think it'd be easier to talk about if it had a proper name.
I might call it a "histogram", if I was to print the entire thing graphically (but it's just a data structure, not a visual representation). If I stored the locations of these values (instead of just their count) I might call it an "inverted index".

I think you have found the most appropriate name already: frequency table or frequency distribution.
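For illustration, a frequency table of the shape described in the question, `{ (field_name, field_value) => count }`, can be computed along these lines. This is a minimal sketch; the function name `frequency_table` and the sample records are invented for the example.

```python
from collections import Counter

def frequency_table(records, fields):
    """Count how often each (field_name, field_value) pair occurs."""
    counts = Counter()
    for record in records:
        for field in fields:
            counts[(field, record[field])] += 1
    return counts

people = [
    {"FirstName": "Ann", "LastName": "Smith", "Height": 170},
    {"FirstName": "Bob", "LastName": "Smith", "Height": 182},
    {"FirstName": "Ann", "LastName": "Jones", "Height": 170},
]
table = frequency_table(people, ["FirstName", "LastName", "Height"])
smith_count = table[("LastName", "Smith")]
```

Like the dictionary in the question, the table is a snapshot: it reflects the records as they were when it was last computed.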

StringLength vs MaxLength attributes ASP.NET MVC with Entity Framework EF Code First

What is the difference in behavior of [MaxLength] and [StringLength] attributes?
As far as I can tell (with the exception that [MaxLength] can validate the maximum length of an array) these are identical and somewhat redundant?

MaxLength is used by Entity Framework to decide how large to make a string value field when it creates the database.
From MSDN:
Specifies the maximum length of array or string data allowed in a property.
StringLength is a data annotation that will be used for validation of user input.
From MSDN:
Specifies the minimum and maximum length of characters that are allowed in a data field.

Some quick but extremely useful additional information that I just learned from another post, but can't seem to find the documentation for (if anyone can share a link to it on MSDN, that would be amazing):
The validation messages associated with these attributes will actually replace placeholders with values from the attributes. For example:
    [MaxLength(100, ErrorMessage = "{0} can have a max of {1} characters")]
    public string Address { get; set; }
This will output the following if the value is over the character limit:
"Address can have a max of 100 characters"
The placeholders I am aware of are:
{0} = Property Name
{1} = Max Length
{2} = Min Length (StringLength only)
Much thanks to bloudraak for initially pointing this out.

The following are the results when we use both the [MaxLength] and [StringLength] attributes in EF code first. If both are used, [MaxLength] wins. See the test results for the StudentName column in the class below:
    public class Student
    {
        public Student() {}

        [Key]
        [Column(Order = 1)]
        public int StudentKey { get; set; }

        //[MaxLength(50), StringLength(60)] // StudentName column will be nvarchar(50)
        //[StringLength(60)]                // StudentName column will be nvarchar(60)
        [MaxLength(50)]                     // StudentName column will be nvarchar(50)
        public string StudentName { get; set; }

        [Timestamp]
        public byte[] RowVersion { get; set; }
    }

All good answers. From the validation perspective, I also noticed that MaxLength is validated on the server side only, while StringLength is validated on the client side too.

Another point to note is that with the MaxLength attribute you can only specify a maximum length, not a minimum.
With StringLength you can specify both.

MaxLengthAttribute means Max. length of array or string data allowed
StringLengthAttribute means Min. and max. length of characters that are allowed in a data field
Visit http://joeylicc.wordpress.com/2013/06/20/asp-net-mvc-model-validation-using-data-annotations/

You can use:
    [StringLength(8, ErrorMessage = "{0} length must be between {2} and {1}.", MinimumLength = 6)]
    public string Address { get; set; }
The error message created by the preceding code would be "Address length must be between 6 and 8.".
MSDN: https://learn.microsoft.com/en-us/aspnet/core/mvc/models/validation?view=aspnetcore-5.0
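The placeholder substitution described in the answers above can be mimicked with Python's positional `str.format`, which happens to use the same `{0}`, `{1}`, `{2}` syntax. This is only an analogy for how the message is assembled, not the .NET implementation; the function name `format_error` is hypothetical, and the argument order (name, maximum, minimum) follows the placeholder list given earlier in this thread.

```python
def format_error(template, property_name, maximum_length, minimum_length):
    """Fill {0}, {1} and {2} with the property name, max length and min length."""
    return template.format(property_name, maximum_length, minimum_length)

message = format_error(
    "{0} length must be between {2} and {1}.", "Address", 8, 6
)
```

Note that `{2}` appears before `{1}` in the template: the numbers refer to argument positions, not to the order in which they appear in the text.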

I have resolved it by adding below line in my context:
modelBuilder.Entity<YourObject>().Property(e => e.YourColumn).HasMaxLength(4000);
Somehow, [MaxLength] didn't work for me.

When using the attribute to restrict the maximum input length for text from a form on a webpage, StringLength seems to generate the maxlength HTML attribute (at least in my test with MVC 5). Which one to choose then depends on how you want to alert the user that this is the maximum text length. With the StringLength attribute, the user will simply not be able to type beyond the allowed length. The MaxLength attribute doesn't add this HTML attribute; instead it generates data validation attributes, meaning the user can type beyond the indicated length, and preventing longer input depends on JavaScript validation when they move to the next field or click submit (or, if JavaScript is disabled, on server-side validation). In this case the user can be notified of the restriction by an error message.
