How to extract file names ending in .txt from a text? - linq

I have a text like below,
Lorem ipsum dolor sit amet, consectetur sample1.txt adipiscing elit. Morbi nec urna non ante varius semper eget vitae ipsum. Pellentesque habitant sample2.txt morbi tristique senectus et netus et malesuada fames.
The text above contains sample1.txt and sample2.txt. The names will not always be sample1 and sample2. I just need to fetch the file names using C#.
Can anyone please help me with this?

Since you tagged it LINQ:
var filesnames = text.Split(new char[] { }) // split on whitespace into words
.Where(word => word.EndsWith(".txt"));

Try something like this
var filesnames = text.Split(' ')
    .Where(o => o.EndsWith(".txt"))
    .Select(o => o.Substring(0, o.LastIndexOf('.'))) // name without the ".txt" extension
    .ToList();

It may be possible with a regular expression if there's a good way to capture what your file names will look like. I'm assuming here it's always blah.txt with alphanumeric characters:
var matches = Regex.Matches(input, @"\b[a-zA-Z0-9]+\.txt\b");
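The difference between splitting on whitespace and matching a pattern shows up as soon as a file name is followed by punctuation. Below is a minimal sketch of that comparison, written in Ruby rather than C# purely for illustration; the sample text and variable names are made up, and the pattern carries the same assumption as above (purely alphanumeric names).

```ruby
text = "Open sample1.txt, then compare it with sample2.txt. Ignore notes.md"

# Whitespace split + suffix check: a word such as "sample1.txt," ends in a
# comma, so it fails the ".txt" suffix test and is missed.
by_split = text.split.select { |word| word.end_with?(".txt") }

# Pattern match: finds the names regardless of trailing punctuation, assuming
# the names consist only of letters and digits.
by_regex = text.scan(/\b[a-zA-Z0-9]+\.txt\b/)

p by_split
p by_regex
```

With this sample the split-based version finds nothing, while the pattern finds both names, so the regular expression is probably the safer choice when the text is ordinary prose.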

Related

Parsing lines of text from external file in Ruby

I am trying to parse a raw email. The desired result is a hash of the lines that contain specific headers.
This is the Ruby file:
raw_email = File.open("sample-email.txt", "r")
parsed_email = Hash.new('')
raw_email.each do |line|
puts line
header = line.chomp(":")
puts header
if header == "Delivered-To"
parsed_email[:to] = line
elsif header == "From"
parsed_email[:from] = line
elsif header == "Date"
parsed_email[:date] = line
elsif header == "Subject"
parsed_email[:subject] = line
end
end
puts parsed_email
And this is the raw email:
Delivered-To: user1@example.com
From: John Doe <user2@example.com>
Date: Tue, 12 Dec 2017 13:30:14 -0500
Subject: Testing the parser
To: user1@example.com
Content-Type: multipart/alternative;
boundary="123456789abcdefghijklmnopqrs"
--123456789abcdefghijklmnopqrs
Content-Type: text/plain; charset="UTF-8"
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer nec
odio. Praesent libero. Sed cursus ante dapibus diam. Sed nisi. Nulla
quis sem at nibh elementum imperdiet. Duis sagittis ipsum.
--123456789abcdefghijklmnopqrs
Content-Type: text/html; charset="UTF-8"
<div dir="ltr">Lorem ipsum dolor sit amet, consectetur adipiscing
elit. Integer nec odio. Praesent libero. Sed cursus ante dapibus diam.
Sed nisi. Nulla quis sem at nibh elementum imperdiet. Duis sagittis
ipsum.<br clear="all">
</div>
--089e082c24dc944a9f056028d791--
The puts statements are just for my own testing to see if data is being passed along.
What I am getting is each full line put twice and an empty hash put at the end.
I have also tried changing different bits to strings or arrays and I've also tried using line.split(":", 1) instead of line.chomp(":")
Can someone please explain why this isn't working?
Try this
raw_email = File.open("sample-email.txt", "r")
parsed_email = {}
raw_email.each do |line|
  # Compare only the text before the first ":" with the header names
  case line.split(":")[0]
  when "Delivered-To"
    parsed_email[:to] = line
  when "From"
    parsed_email[:from] = line
  when "Date"
    parsed_email[:date] = line
  when "Subject"
    parsed_email[:subject] = line
  end
end
raw_email.close
puts parsed_email
=> {:to=>"Delivered-To: user1@example.com\n", :from=>"From: John Doe <user2@example.com>\n", :date=>"Date: Tue, 12 Dec 2017 13:30:14 -0500\n", :subject=>"Subject: Testing the parser\n"}
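Explanation
You need to split the line on ":" and take the first element, like this: line.split(":")[0]
line.chomp(":") only removes a ":" if it is the very last character of the string, so the line comes back unchanged and never equals a bare header name. line.split(":", 1) does not help either, because a limit of 1 returns the whole line as a single element.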
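To see the three calls side by side, here is a small sketch using one header line from the sample email (the variable names are only for illustration). It also uses a limit of 2, which is an addition to the answer above: it splits at the first ":" only, so a value that itself contains colons, such as the time in the Date header, stays in one piece.

```ruby
line = "From: John Doe <user2@example.com>\n"

# chomp(":") strips a ":" only when it is the last character; this line ends
# in "\n", so nothing is removed.
chomped = line.chomp(":")

# A limit of 1 means "at most one field", so the whole line comes back.
limit_one = line.split(":", 1)

# A limit of 2 splits at the first ":" only, leaving the rest untouched.
name, value = line.split(":", 2)

p chomped == line
p limit_one
p name
p value.strip
```

Calling strip on the value also removes the trailing newline, which the hash in the answer above still contains.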

Paginating imported XML Data

All,
I'm having a hell of a time getting this to work. I have a very basic XML structure:
<root>
<item>
<header>NEW HEADER</header>
<body>NEW BODY - Sed auctor justo et erat rutrum, nec molestie neque placerat. Quisque efficitur condimentum velit nec volutpat. Nunc sed magna vel mauris convallis sodales</body>
<footer>NEW - Footer: Donec in nibh risus. Sed placerat felis non pellentesque placerat. In non risus a elit malesuada consectetur.</footer>
</item>
<item>
<header>NEW HEADER 2</header>
<body>NEW BODY - Sed auctor justo et erat rutrum, nec molestie neque placerat. Quisque efficitur condimentum velit nec volutpat. Nunc sed magna vel mauris convallis sodales</body>
<footer>NEW - Footer: Donec in nibh risus. Sed placerat felis non pellentesque placerat. In non risus a elit malesuada consectetur.</footer>
</item>
</root>
I've created an InDesign template with tagged text-area placeholders. What I want to achieve is create a new page for each <item> tag and populate the data appropriately. When I load my XML, it loads each <item> but it doesn't generate a new page for each one.
Any help would be appreciated.
That's because you need to understand some basic rules. Number one is that, within InDesign, XML is just about text. In your case, the template needs a generic set of tags and a page break character. You then ask InDesign to duplicate that set and character at every occurrence of the repeated incoming node. I wrote a blog post that covers all of these peculiarities, especially for rookies ;) : http://www.ozalto.com/en/5-errors-you-will-do-with-indesign-xml/
You'll want to take a look at the "Merge Mode" section of Adobe's Importing XML documentation here:
https://helpx.adobe.com/indesign/using/importing-xml.html
From that page:
Merge mode not only makes automated layout possible, it provides more
advanced import options, including the ability to filter incoming text
and clone elements for repeating data.
It sounds like you need the "clone elements" feature.
To get a new page for each <item>, put a page break at the end of each <item>.
Then make sure to set a "Primary Text Frame" on your master page:
https://helpx.adobe.com/indesign/using/whats-new-cs6.html#id_16192
With this set, InDesign will simply create new pages as needed.

Wrapping lines with leading speechmark in Sublime

When using "Wrap paragraph at XX characters" on a paragraph that starts with a leading speech mark, the speech mark is placed in front of every line of that paragraph.
How to reproduce:
Write a text (or generate one at lipsum.com) with one or more paragraphs.
Put the first word (or words) in speech marks: "Lorem ipsum" dolor...
Select this paragraph.
Choose Edit >> Wrap >> Wrap paragraph at 70 characters.
You'll get something like
"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras
"scelerisque sagittis est, ac consectetur lorem mattis tincidunt.
"Vestibulum sit amet lobortis odio, vel sollicitudin enim. Vivamus
"imperdiet tortor eu est malesuada condimentum. Suspendisse et posuere
"urna. Vivamus consequat sapien id vestibulum auctor. Vivamus suscipit
...
Can I somehow change this behaviour in the user settings or do I have to treat this as a bug?
To fix this, you'll have to make a small edit to paragraph.py in the Packages/Default folder. How you do this depends on the version of Sublime you're using. If you're using ST2, simply select Preferences -> Browse Packages... to open the Packages folder in your operating system's file explorer. Go to the Default folder, and open paragraph.py in Sublime.
If you're using ST3, the process is slightly more complicated, as most packages are stored in zipped .sublime-package files. Install Package Control if you haven't already, then install the PackageResourceViewer plugin. In the Command Palette, select PackageResourceViewer: Open Resource, select Default, then select paragraph.py.
Once you have paragraph.py open, scroll down to line 188:
prefix = self.extract_prefix(s)
Change it to:
prefix = None # self.extract_prefix(s)
and save the file. Now, this text (with a single quote at the beginning):
becomes:

Getting the MoreLinq MaxBy function to return more than one element

I have a situation in which I have a list of objects with an int property and I need to retrieve the 3 objects with the highest value for that property. The MoreLinq MaxBy function is very convenient to find the single highest, but is there a way I can use that to find the 3 highest? (Not necessarily of the same value). The implementation I'm currently using is to find the single highest with MaxBy, remove that object from the list and call MaxBy again, etc. and then add the objects back into the list once I've found the 3 highest. Just thinking about this implementation makes me cringe and I'd really like to find a better way.
Update: In version 3, MaxBy (including MinBy) of MoreLINQ was changed to return a sequence as opposed to a single item.
Use MoreLINQ's PartialSort or PartialSortBy. The example below uses PartialSortBy to find and print the longest 5 words in a given text:
var text = @"
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Etiam gravida nec mauris vitae sollicitudin. Suspendisse
malesuada urna eu mi suscipit fringilla. Donec ut ipsum
aliquet, tincidunt mi sed, efficitur magna. Nulla sit
amet congue diam, at posuere lectus. Praesent sit amet
libero vehicula dui commodo gravida eget a nisi. Sed
imperdiet arcu eget erat feugiat gravida id non est.
Nam malesuada nibh sit amet nisl sollicitudin vestibulum.";
var words = Regex.Matches(text, @"\w+");
var top =
    from g in words.Cast<Match>()
                   .Select(m => m.Value)
                   .GroupBy(s => s.Length)
                   .PartialSortBy(5, m => m.Key, OrderByDirection.Descending)
    select g.Key + ": " + g.Distinct().ToDelimitedString(", ");
foreach (var e in top)
    Console.WriteLine(e);
It will print:
14: malesuadafsfjs
12: sollicitudin
11: consectetur, Suspendisse
10: adipiscing, vestibulum
9: malesuada, fringilla, tincidunt, efficitur, imperdiet
In this case, you could simply do
yourResult.OrderByDescending(m => m.YourIntProperty)
    .Take(3);
This returns exactly 3 objects.
So if 4 objects share the same (maximum) value, 1 of them will be skipped. Not sure if that's what you want, or if it's not a problem...
But MaxBy (before MoreLINQ version 3) also returns only one element when several elements share the same "max value".
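The tie behaviour described here is easy to check on a tiny data set. The sketch below is in Ruby rather than C#, purely as an illustration; the Item struct and the scores are hypothetical. It takes the top 3 after a descending sort, then shows one possible way to keep every object that ties with the third-highest value.

```ruby
Item = Struct.new(:name, :score)

items = [
  Item.new("a", 7), Item.new("b", 9), Item.new("c", 9),
  Item.new("d", 9), Item.new("e", 9), Item.new("f", 3)
]

# Sort descending and take three: exactly 3 objects, even though four share
# the maximum score, so one of the tied objects is dropped.
top_three = items.sort_by { |item| -item.score }.first(3)

# Alternative: keep everything at or above the third-highest score, so that
# ties are not cut off arbitrarily.
cutoff = top_three.last.score
with_ties = items.select { |item| item.score >= cutoff }

p top_three.map(&:score)
p with_ties.map(&:name)
```

Which of the two results is wanted depends on whether "the 3 highest" means three objects or the three best positions.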

How to highlight multiple selections?

For example, I have some text in ace-editor and a list of ranges (rows and columns) in the text where highlighting should happen. Like this (they're bolded):
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nam cursus.
Morbi ut mi. Nullam enim leo, egestas id, condimentum at, laoreet mattis,
massa. Sed eleifend nonummy diam. Praesent mauris ante, elementum et,
bibendum at, posuere sit amet, nibh.
How to highlight these words by using ace-editor API?
How to highlight multiple lines?
Finally I've got the answer.
Highlight the word:
var Range = ace.require("ace/range").Range; // Range is not a global; load it from Ace
var range = new Range(rowStart, columnStart, rowEnd, columnEnd);
var marker = editor.getSession().addMarker(range,"ace_selected_word", "text");
Remove the highlighted word:
editor.getSession().removeMarker(marker);
Highlight the line:
editor.getSession().addMarker(range, "ace_active_line", "fullLine");
