I am tangling with a project involving enormous text log files on the server and it is getting logistically difficult moving files around so I can get at them with my main programming editors. It has become clear that I need to install editors on the servers. Installing full devel tools is out of the question, and frankly, I'm a bit stumped as I guess I've mostly been using Visual Studio and Eclipse so long I just don't know much about the current state of editors out there.
I sure bet someone here on StackOverflow does!
I need something that can handle large (multi-GB) text files efficiently, and decent multiline regular expression support is a must. Beyond those features, a minimal footprint matters more than extra features. The OS is Windows Server 2008 R2. Free, of course, or reasonably priced like shareware. Does anyone have suggestions?
Microsoft Log Parser:
Log parser is a powerful, versatile
tool that provides universal query
access to text-based data such as log
files, XML files and CSV files, as
well as key data sources on the
Windows® operating system such as the
Event Log, the Registry, the file
system, and Active Directory®.
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&displaylang=en
I have used it for parsing huge log files
Some examples: http://support.microsoft.com/kb/910447
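If a small script is acceptable on the server, a multiline regex can also be run over a multi-GB file without loading it all into memory. This is not Log Parser; it is only a rough Python sketch that assumes a 64-bit Python install is allowed on the box. The function name `find_multiline` is made up for illustration.

```python
import mmap
import re


def find_multiline(path, pattern, flags=re.MULTILINE):
    """Return (byte_offset, matched_bytes) pairs for a bytes regex.

    The file is memory-mapped read-only, so the OS pages data in on
    demand instead of Python reading the whole file into memory.
    Note: mmap refuses empty files, so an empty log raises ValueError.
    """
    regex = re.compile(pattern, flags)
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The re module accepts any bytes-like object, including mmap.
            return [(m.start(), m.group(0)) for m in regex.finditer(mm)]
```

A call such as `find_multiline(path, rb"^ERROR[^\n]*\n(?:[ \t]+at [^\n]*\n)+")` would collect each ERROR line together with the indented "at ..." lines that follow it; the pattern is just an example of a match spanning several lines.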
In most of the games and programs you download, you just get the installer.
Some .exe files can be run directly, though (probably because they don't have many files to extract?).
I was wondering: what's the difference between an installer that just extracts files and a zip (rar, iso, ...) file that you could download in as little as a few seconds, depending on your internet speed? And where does a 200 MB installer, say, get 5 GB of files from while offline?
I've never heard about this, and I'm learning to program, so I'd appreciate if you could answer me properly.
What you're really asking is:
How does an installer work?
A bit of background.
In the Before Times, man did not have such things as "installers." Software was run directly off of floppy disks (and none of that rigid 3.5" crap, I'm talking disks that flopped), like God intended.
Then came the first home computers with persistent hard drives. For the first time, it made sense to copy a program off a disk and have it stick around.
But programs still worked the way "portable" applications do today: you copied them as-is and ran them as-is.
Then operating systems began to get more complicated.
Windows introduced this notion of a registry: a central location where program and operating system configuration could be stored. Software authors began using this registry. Its arcane architecture and user-hostile editing utility (the infamous regedit.exe) made it the perfect place to store shareware information -- how many days you have left on your trial, for example.
This happened around the same time that programs began to be too large to fit -- uncompressed -- on a single floppy disk. A way was needed to split a program onto multiple disks. Since it wasn't very user-friendly to require the user to have e.g. a ZIP extractor installed (remember, this was before ubiquitous Internet), Windows programs began to be shipped with installers. You can think of these as basically portable versions of WinZIP whose sole purpose was to reassemble and extract a compressed file.
These days, installers serve a number of other purposes:
providing a convenient user interface
prompting the user to accept a click-through end-user license agreement (EULA)
prompting the user for CD keys (though this is being phased out for many systems in favor of digital distribution)
asking the user to register their software
and so on. They may also serve as DRM vehicles, validating CDs and decrypting data to prevent villainous individuals (yarr) from brrreakin' ye olde DMCA.
At their heart, they aren't any more complex than in the Windows 95 days -- a glorified unzip program.
Sidenote: Where does the installer get 5GB of data from 200MB of archives if not the Internet?
That's high, though there are plenty of ways you could get that compression ratio. Imagine a complex game whose world is defined in verbose XML -- that's readily compressible. You could even get that back in the old WinZIP days.
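To get a feel for how compressible verbose, repetitive XML is, here is a small Python sketch. The XML record is invented for the example, and real game data is far less repetitive, so treat the resulting ratio as an upper-end illustration and not a measurement of any actual installer.

```python
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Return original size divided by DEFLATE-compressed size."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


# Hypothetical, highly repetitive XML of the kind a verbose world format might hold.
record = b'<tile><terrain type="grass"/><height value="0"/><passable value="true"/></tile>\n'
world = b"<world>\n" + record * 100_000 + b"</world>\n"

ratio = compression_ratio(world)
print(f"{len(world)} bytes compress at roughly {ratio:.0f}:1")
```

For comparison, 5 GB out of 200 MB is a ratio of 25:1.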
A zip file can only hold some files and then you unzip and get those files as is.
An installer, however, can be a very complicated program. It can create the needed file and folder structures, register the required DLLs on your system, let you choose which features to install, check your system for compatibility, and act as a wizard that guides you step by step through a custom installation of your application.
An Installer (esp. Windows Installer) can make automatic Registry entries, as well as unpack and write files to a directory. With the Zip, you have to manually extract the files, and get no automatic registry edits.
The advantage to a zip is that it guarantees (most of the time) that the application is portable, that all necessary files are included in the unzipped directory.
The advantage of an installer is pretty obvious: automated, UI.
As for 200 MB -> 5 GB: compressing the files into an exe can use a stronger compression scheme than simply throwing the files into a zipped folder. Still, 200 MB -> 5 GB is a pretty big jump; not impossible, just pretty big. Installers that do need to fetch a large external (online) download typically tell you beforehand that they are about to download a large chunk of data and that you should not disconnect from the internet during the install.
An installer or EXE can easily be infected by a virus, whereas a ZIP archive is less likely to be infected. A ZIP is more flexible too, because you can protect it with your own password.
Another everyday benefit is that ZIP compresses the files as well.
Hope that makes sense.
When I install a program, I need to know which files were added or modified and which registry entries were changed. Can someone suggest a program that does this, or maybe some code?
I think this tool does exactly what you need: http://technet.microsoft.com/en-us/sysinternals/bb896645
Process Monitor is an advanced monitoring tool for Windows that shows
real-time file system, Registry and process/thread activity. It
combines the features of two legacy Sysinternals utilities, Filemon
and Regmon, and adds an extensive list of enhancements including rich
and non-destructive filtering, comprehensive event properties such
session IDs and user names, reliable process information, full thread
stacks with integrated symbol support for each operation, simultaneous
logging to a file, and much more. Its uniquely powerful features will
make Process Monitor a core utility in your system troubleshooting and
malware hunting toolkit.
SysTracer does exactly what you want:
SysTracer is a system utility tool that can scan and analyze your
computer to find changed (added, modified or deleted) data into
registry and files.
There are both free and paid versions.
http://www.blueproject.ro/systracer
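Since the question also asks for code: the snapshot-and-compare idea can be sketched in a few lines of Python for the file system part. This is only an illustration of the general approach, not how SysTracer or Process Monitor is implemented, and it does not look at the registry. The function names are made up for the example.

```python
import os


def snapshot(root):
    """Map each file path under root to its (size, modification time)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            state[path] = (st.st_size, st.st_mtime_ns)
    return state


def diff_snapshots(before, after):
    """Return sorted lists of added, removed and modified paths."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    modified = sorted(
        p for p in before.keys() & after.keys() if before[p] != after[p]
    )
    return added, removed, modified
```

Take one snapshot before running the installer, another one afterwards, and pass both to `diff_snapshots`. A change that keeps both size and timestamp identical would go unnoticed; hashing file contents would catch that at the cost of speed.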
When I develop web applications, I frequently need to sync files from a working folder to an external server or another folder. I like keeping my code separate from the web server.
In the open source world, Eclipse with file sync does the job pretty well. Unfortunately, I can't find a good equivalent for Visual Studio.
I've only found two generic solutions:
- WinSCP, which is pretty good but gets stuck when a file is locked and asks for confirmation, which is quite annoying.
- DSynchronize which works pretty well (ie. doesn't ask questions) but doesn't have filters so I can't tell it not to sync my .svn files or web.conf :(.
Do you know any good way to achieve real-time synchronization in Visual Studio or Windows?
It doesn't have to have a GUI. In fact, I would love to see a command-line solution, like a PowerShell command that outputs modified files.
I've ended up using Mercurial (to skip the .svn files) and DSynchronize to sync files.
I would give the immortal classic a try: rsync. There is a Cygwin-based implementation for Windows called cwRsync: http://www.itefix.no/i2/node/10650 . With proper configuration (and possibly some fine-tuning with scripting as well) it should do the job perfectly.
If you would like bidirectional synchronization, Unison may be the answer:
http://www.cis.upenn.edu/~bcpierce/unison/
If you are looking for something even fancier, you might try one of the distributed file systems available, like CODA (I'm afraid decent Windows systems aren't supported yet): http://www.coda.cs.cmu.edu or the native DFS solution from Microsoft. However, I'm afraid the setup is too much hassle (if not impossible in your case), since it's targeted at enterprise solutions:
http://technet.microsoft.com/en-us/library/cc753479(WS.10).aspx
Of course, the DFS option probably won't support the filtering you are interested in.
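For the "command line tool that outputs modified files, with filters" part of the question, a minimal Python sketch could look like the following. It only lists candidates and leaves the actual copying to whichever sync tool you pick. The function names and the default exclusion pattern are assumptions made for the example.

```python
import fnmatch
import os


def _excluded(name, patterns):
    """True if a file or directory name matches any exclusion pattern."""
    return any(fnmatch.fnmatch(name, p) for p in patterns)


def modified_since(root, since, exclude=(".svn",)):
    """Yield paths under root modified after `since` (seconds since epoch).

    Names matching `exclude` are skipped; excluded directories are
    pruned so their contents are never visited.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to descend there.
        dirnames[:] = [d for d in dirnames if not _excluded(d, exclude)]
        for name in filenames:
            if _excluded(name, exclude):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since:
                yield path
```

Calling `modified_since(folder, last_sync_time, exclude=(".svn", "web.config"))` and printing each result gives a list that can be fed to a copy step.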
We've been using VSS 6.0 since time began, but yesterday I nabbed VSS2005 off of our MSDN subscription. It wouldn't let me install it from the ISO through Daemon Tools (not sure why, but I submitted an error report to MS...). I noticed it had a Program Files directory right on the ISO, so I just copied the folder onto my hard drive. Well, I opened up the client and, behold, a glamorous version of VSS 6.0 connected to the exact same DB.
Anyone know if I'm going to destroy everything by using it?
We moved from VSS6 to VSS2005 just over a year ago. The database structure is identical. The only caveat we found was if some people still used VSS6 on a database where others were using VSS2005. VSS2005 treats Unicode text files as text files, whereas VSS6 does not. Which means that when VSS2005 adds a Unicode text file, VSS6 sees it as binary (this affects csproj files among others).
Other than that, VSS2005 supports proper HTTP access to the database (provided server extensions are installed), improved LAN performance (again, with server extensions), and better file system dialogs (the nasty old ones are gone). However, the new file add dialog shows ALL files, not just the ones that aren't included.
Also, VSS2005 allows the provision of custom editors and differencing tools by file extension, which is very useful. For example, some of our XML files are encrypted, so we run a decryption tool before the difference tool by using this system, which has increased the efficiency of our review processes substantially.
There are also other tweaks here and there, mostly good but occasionally annoying.
Finally, nothing has been destroyed. In fact, there appears to have been less additional corruption in the database since the transition - but I wouldn't put this down to the new VSS as it wasn't a comprehensive test.
I'm pretty sure that there is no more danger of destroying anything than when using VSS 6.0.
It's been quite a long time since I last used VSS, but we also updated from version 6 to version 2005. As far as I remember, there were only some cosmetic changes in the client (VSS Explorer), while the format of the database and the available features were exactly the same as in VSS 6.
You should be fine.
Since VSS just uses a file share for everything, and there's nothing that is really server based, you're fine. Not much has changed in the format of the database, mostly client side stuff.
Back in the old days, Help was not trivial but possible: generate some funky .rtf file with special tags, run it through a compiler, and you got a WinHelp file (.hlp) that actually works really well.
Then, Microsoft decided that WinHelp was not hip and cool anymore and switched to CHM, up to the point they actually axed WinHelp from Vista.
Now, CHM may be nice, but everyone who has tried to open a .chm file over the network will know the nice "Navigation to the webpage was canceled" screen that is caused by security restrictions.
While there are ways to make CHM work off the network, this is hardly a good choice, because when a user presses the Help Button he wants help and not have to make some funky settings.
Bottom Line: I find CHM absolutely unusable. But with WinHelp not being an option anymore either, I wonder what the alternatives are, especially when it comes to integrate with my Application (i.e. for WinHelp and CHM there are functions that allow you to directly jump to a topic)?
PDF has the disadvantage of requiring the Adobe Reader (or one of the more lightweight ones that not many people use). I could live with that seeing as this is kind of standard nowadays, but can you tell it reliably to jump to a given page/anchor?
HTML files seem to be the best choice, you then just have to deal with different browsers (CSS and stuff).
Edit: I am looking to create my own help files. As I am a fan of the "No Setup, Just Extract and Run" philosophy, I have had this problem many times in the past, because many of my users run the application off the network.
So I am looking for a more robust and future-proof way to provide help to my users without having to code a different help system for each application I make.
CHM is a really nice format, but that Security Stuff makes it unusable, as a Help system is supposed to provide help to the user, not to generate even more problems.
HTML would be the next best choice, ONLY IF you would serve them from a public web server. If you tried to bundle it with your app, all the files (and images (and stylesheets (and ...) ) ) would make CHM look like a gift from gods.
That said, when actually bundled in the installation package, (instead of being served over the network), I found the CHM files to work nicely.
OTOH, another pitfall about CHM files: Even if you try to open a CHM file on a local disk, you may bump into the security block if you initially downloaded it from somewhere, because the file could be marked as "came from external source" when it was obtained.
I don't like the HTML option, and actually moved from plain HTML to CHM by compressing and indexing the files. We even use them with a handful of non-Windows customers.
It simply solved the constant little breakages: people putting the files on the network (limited nesting depth, strange locking effects), antivirus software that choked on directories with 30000 HTML files, 20 minutes of decompression time while installing on an older system, browser safety zones and features, miscalculations of needed space in the installer, etc.
And that doesn't even include the people who start "correcting" them, third-party products with faulty "integration" attempts, complaints about slowness (browser start-up), etc.
We had all waited years for the problems to go away as OSes and hardware improved, but they kept recurring in a bewildering number of varieties, and enough was enough. We found chmlib and decided that, if the OS-provided readers ever stopped working, we could always escape to a simple external reader based on it, so we switched.
Meanwhile we also have our own compiler, so we are future-proof and free of MS. That doesn't mean we will never change (solutions with local web servers seem to be the favourite nowadays), but at least we have a choice.
Our software is both distributed locally to the clients and served from a network share. We opted for generating both a CHM file and a set of HTML files for serving from the network. Users starting the program locally use the CHM file, and users getting their program served from a network share have to use the HTML files.
We use Help and Manual and can thus easily produce both types of output from the same source project. The HTML files also include search capabilities and don't require a web server, so though it isn't an optimal solution, it works fine.
So far, all the single-file types for Windows seem broken in one way or another:
WinHelp - obsoleted
HtmlHelp (CHM) - obsoleted on Vista, doesn't work from network share, other than that works really nice
Microsoft Help 2 (HXS) - this seems to work right up until the point when it doesn't, corrupted indexes or similar, this is used by Visual Studio 2005 and above, as an example
If you don't want to use an installer and you don't want the user to perform any extra steps to allow CHM files over the network, why not fall back to WinHelp? Vista does not include WinHlp32.exe out of the box, but it is freely available as a download for both Vista and Server 2008.
It depends on how important the online documentation is to your product. A good documentation infrastructure can be complex to establish, but once done it pays off. Here is how we do it:
Help source: DITA-compliant XML, stored in SCC (ClearCase).
Help editing: XMetaL.
Help compilation: customized DITA Open Toolkit, with custom Perl/Java preprocessing.
Help source cross-references application resources at compile time (.RC files etc.).
Help deliverables from a single source: PDF, CHM, Eclipse Help, HTML.
A single source repository produces help for multiple products (10+) with thousands of shared topics.
From what you describe, I would look at Eclipse Help. It's not simple to integrate into .NET or MFC applications: you basically have to do the help mapping to resolve the request to a URL, then fire the URL at an Eclipse Help wrapper or a browser.
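The "help mapping" step, resolving a help request to a URL and handing it to a browser, can be sketched roughly like this in Python. The topic table, context IDs and file names are hypothetical, and this shows only the generic idea for a folder of HTML help files; it is not the Eclipse Help API.

```python
from pathlib import Path
import webbrowser

# Hypothetical mapping from application help context IDs to (page, anchor).
HELP_TOPICS = {
    1001: ("getting-started.html", None),
    1002: ("settings.html", "network"),
}


def help_url(help_root, context_id, topics=HELP_TOPICS, default="index.html"):
    """Resolve a help context ID to a file:// URL, with an optional anchor."""
    page, anchor = topics.get(context_id, (default, None))
    url = (Path(help_root) / page).resolve().as_uri()
    return f"{url}#{anchor}" if anchor else url


def show_help(help_root, context_id):
    """Open the resolved topic in the user's default browser."""
    webbrowser.open(help_url(help_root, context_id))
```

Unknown IDs fall back to the index page, so the Help button always shows something. Whether the `#anchor` part survives may depend on how the browser gets launched on a given system, so it is worth testing on the target machines.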
Is the question how to generate your own help files, or what is the best help file format?
Personally, I find CHM to be excellent. One of the first things I do when setting up a machine is to download the PHP Manual in CHM format (http://www.php.net/download-docs.php) and add a hotkey to it in Crimson Editor. So when I press F1 it loads the CHM and performs a search for the word my cursor is on (great for quick function reference).
If you are doing "just extract and run", you are going to run into security issues. This is especially true if your users are running Vista (or later). Is there a reason why you want to avoid packaging your applications inside an installer? Using an installer would alleviate the "external source" problem, and you would be able to use .chm files without any problems.
We use InstallAware to create our install packages. It's not cheap, but it is very good. If cost is your concern, WiX is open source and pretty robust. WiX does have a learning curve, but it's easy to work with.
PDF has the disadvantage of requiring the Adobe Reader
I use Foxit Reader on Windows at home and at work. A lot smaller and very quick to open. Very handy when you are wondering what exactly a80000326.pdf is and why it is clogging up your documents folder.
I think the solution we're going to end up going with for our application is hosting the help files ourselves. This gives us immediate access to the files and the ability to keep them up to date.
What I plan is to have the content loaded into a huge series of XML files, each one containing help for a specific item. This XML would contain links to other XML files. We would use XSLT to display the contents as necessary.
Depending on the licensing, we may build a client-specific XSLT file in order to tailor the look and feel to what they need. We may need to be able to only show help for particular versions of our product as well and that can be done by filtering out stuff in the XSLT.
I use a commercial package called AuthorIT that can generate a number of different formats, such as chm, html, pdf, word, windows help, xml, xhtml, and some others I have never heard of (does dita ring a bell?).
It is a content management system oriented towards the needs of technical documentation writers.
The advantage is that you can use and re-use the same content to build a set of guides, and then generate them in different formats.
So the bottom line relative to the question of choosing chm or html or whatever is that if you are using this you are not locked into a given format, but you can provide several among which the user can choose, and you can even add more formats as you go along, at no extra cost.
If you just have one guide to create it won't be worth your while, but if you have a documentation set to manage then it is the best to my knowledge. Their support is very helpful also.