Let's say I have a list of URLs separated by a white space with their corresponding titles.
http://url1.com/qfwarsas/ gb_title 1 - 1
http://url2.com/arsas/ xe_title 2 - 2
http://url3.com/qfsas ah_title 3 - 3
I'm trying to sort the lines by the titles to look like this:
http://url3.com/qfsas ah_title 3 - 3
http://url1.com/qfwarsas/ gb_title 1 - 1
http://url2.com/arsas/ xe_title 2 - 2
I can do it by running a simple macro to copy out the first letter of each title to the front of the line, then ctrl+v sort the blocks, then remove the first letters of each line. I wonder if there's a way to do it using regex and visual block selection?
The regex to select the first letter of each title is
:s/\v[^ ]* (.)/\1/
but when I try to convert that into a visual block selection, I run into issues.
Any ideas?
If your separator is a white space, you can use
:sort / /
The default behavior of :sort using a search pattern is to sort on whatever follows the match.
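To make the "sort on whatever follows the match" rule concrete, here is a rough Ruby model of it. This is only a sketch of the key selection, not of Vim itself: sort_after_match is a hypothetical helper, and treating the whole line as the key when nothing matches is a simplification of what Vim does with such lines.

```ruby
# Rough model of Vim's `:sort /{pattern}/`: the sort key is whatever
# follows the first match of the pattern on each line.
# sort_after_match is a hypothetical helper, not a Vim or Ruby API.
def sort_after_match(lines, pattern)
  lines.sort_by do |line|
    m = pattern.match(line)
    # Simplification (assumption): a line without a match uses itself as key.
    m ? m.post_match : line
  end
end

LINES = [
  "http://url1.com/qfwarsas/ gb_title 1 - 1",
  "http://url2.com/arsas/ xe_title 2 - 2",
  "http://url3.com/qfsas ah_title 3 - 3",
]

# Keys are "gb_title 1 - 1", "xe_title 2 - 2" and "ah_title 3 - 3".
puts sort_after_match(LINES, / /)
```

With the sample lines from the question this prints the url3 line first, then url1, then url2.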
I have a Google Sheets document with data in this format:
Some data 10:5 Somemore Data
I am trying to separate the text from the numbers in separate columns based on the colon sign so that the output looks like this:
Some data | 10 | 5 | Somemore Data
I tried the SPLIT and RIGHT/LEFT functions but I can't get it to work.
This is what I have so far
=LEFT(C2,FIND(":",C2)-3)
This separates the text on the LEFT, but using it on the right side doesn't work. My formula also doesn't separate the numbers. I'm looking for a formula that produces the desired result above.
My spreadsheet - https://docs.google.com/spreadsheets/d/1EmL4kzCGxRbwvNJntwMokqgt8yjjAqnZuUidTbZe6Z8/edit?usp=sharing
Thanks.
There is already a solution in your shared sheet with SPLIT and REGEXREPLACE.
Here is one a bit simpler with REGEXEXTRACT:
=ARRAYFORMULA(IF(A2:A="", "", REGEXEXTRACT(A2:A,"^(.+?)[ ]+(\d+)[ ]*:[ ]*(\d+)[ ]+(.+)$")))
Every group will be a cell in a row to the right.
Regex description and demo: link.
Edit: stripped spaces. Your strings contain a nasty character: the non-breaking space (nbsp, char 160), which looks identical to a regular space. That is why a simpler regex (^(.+?)\s+(\d+)\s*:\s*(\d+)\s+(.+)$) did not work. Hence [ ] (nbsp and a regular space) instead of just \s.
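The nbsp effect is easy to reproduce outside Sheets. Below is a small Ruby sketch, not the Sheets formula: Ruby uses a different regex engine, but its \s is likewise limited to ASCII whitespace. The sample string is made up, with an nbsp placed between "data" and "10" as an assumption about where the odd character sits.

```ruby
# NBSP is U+00A0 (char 160). Ruby's \s covers only ASCII whitespace,
# so it does not match NBSP.
WITH_S   = /^(.+?)\s+(\d+)\s*:\s*(\d+)\s+(.+)$/
# Same pattern, but every \s is replaced by an explicit class that
# lists both NBSP and the regular space.
WITH_SET = /^(.+?)[\u00A0 ]+(\d+)[\u00A0 ]*:[\u00A0 ]*(\d+)[\u00A0 ]+(.+)$/

# Hypothetical sample: an NBSP sits between "data" and "10".
SAMPLE = "Some data\u00A010:5 Somemore Data"

p WITH_S.match(SAMPLE)             # nil, \s cannot cross the NBSP
p WITH_SET.match(SAMPLE).captures  # four groups, one per output cell
```

Each capture group corresponds to one cell in the REGEXEXTRACT output.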
I have a problem writing a regex (in Ruby, but I don't think that changes anything) that selects all proper hashtags.
I used ( /(^|\s)(#+)(\w+)(\s|$)/ ), which doesn't work and I have no idea why.
In this example:
#start #middle #middle2 #middle3 bad#example #another#bad#example #end
it should mark #start, #middle, #middle2, #middle3 and #end.
Why doesn't my code work and how should a proper regex look?
As for why the original does not work, let's look at each part:
1. (^|\s) start of line or whitespace
2. (#+) one or more #
3. (\w+) one or more word characters (letters, digits, underscore)
4. (\s|$) whitespace or end of line
The main problem is a conflict between 1 and 4. When 4 matches the whitespace after a hashtag, that whitespace is consumed, so it is no longer available for 1 to match in front of the next hashtag. That hashtag is skipped, and the search moves on to the next possible match.
4 is not really needed, since 3 will not match whitespace.
So here is the result
(?:^|\s)#(\w+)
https://regex101.com/r/iU4dZ3/3
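To see the consumed-whitespace problem in action, here is a short Ruby sketch that runs the original regex on the sample string. For comparison it also tries a lookaround variant, which is my own assumption and not taken from the question or the answers: lookarounds check the neighbouring character without consuming it.

```ruby
TEXT = "#start #middle #middle2 #middle3 bad#example #another#bad#example #end"

# The regex from the question: the trailing (\s|$) eats the whitespace
# that the next hashtag would need for its leading (^|\s).
ORIGINAL = /(^|\s)(#+)(\w+)(\s|$)/

# Assumed alternative: require "no non-whitespace" on both sides via
# lookbehind/lookahead, which consume nothing.
LOOKAROUND = /(?<!\S)#\w+(?!\S)/

# With capture groups, scan yields one array of groups per match;
# glue groups 2 and 3 back together to get the tag.
ORIGINAL_TAGS = TEXT.scan(ORIGINAL).map { |_, hashes, word, _| hashes + word }
LOOKAROUND_TAGS = TEXT.scan(LOOKAROUND)

p ORIGINAL_TAGS    # ["#start", "#middle2", "#end"]
p LOOKAROUND_TAGS  # ["#start", "#middle", "#middle2", "#middle3", "#end"]
```

The original pattern finds only every other tag in a run of adjacent tags, which matches the explanation above.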
Does [^#\w](#[\w]*)|^(#[\w]*) work?
It matches a # that does not follow a word character (or another #) and captures everything up to the first non-word character.
The alternation handles the case where the first character of the string is #.
Live demo: http://regexr.com/3al01
How's this work for you?
(#[^\s+]+)
This says: find a hash sign, then everything up to the next whitespace character (the + inside the brackets is literal, so a plus sign also ends the match).
One more regex:
\B#\w+\b
This one doesn't capture any whitespace.
https://regex101.com/r/iU4dZ3/4
I am attempting to separate blocks of Japanese text into individual sentences using regex. Right now I'm mostly experimenting on Rubular, but here is what I have so far.
regex: /(.*?(。|?|!))/
sample text
強面のため周囲の人から敬遠されている主人公が、クラスメイトと共通の話題を持とうとVRMMORPG「アナザーワールド」のベータテストに申し込んだ。ところが当選したのは彼一人。しかたなくひとりでゲーム内の仮想世界「イストピア」に「ケイオス」と名乗って乗り込んだが、そこはゲームでありながら五感すべてを体感でき、現実と間違えるほどのリアルな世界だった。サポートAIのテミスの協力を得つつ、クエストをこなしていったが、実はそこは本物の異世界「イストピア」であり、ケイオスのこなしたクエストによって、多くの人が影響を受けて……というお話。その戯言、聞き飽きたわ!あれ、ここにあった筆入れはどこにやったの?
The results I'm getting are correct; however, it is also separately matching the punctuation characters.
How can I improve my regular expression so that the punctuation mark isn't separately matched?
Using (.*?[。?!]) seems to do the trick, check on rubular
Match 1
1. 強面のため周囲の人から敬遠されている主人公が、クラスメイトと共通の話題を持とうとVRMMORPG「アナザーワールド」のベータテストに申し込んだ。
Match 2
1. ところが当選したのは彼一人。
Match 3
1. しかたなくひとりでゲーム内の仮想世界「イストピア」に「ケイオス」と名乗って乗り込んだが、そこはゲームでありながら五感すべてを体感でき、現実と間違えるほどのリアルな世界だった。
Match 4
1. サポートAIのテミスの協力を得つつ、クエストをこなしていったが、実はそこは本物の異世界「イストピア」であり、ケイオスのこなしたクエストによって、多くの人が影響を受けて……というお話。
Match 5
1. その戯言、聞き飽きたわ!
Match 6
1. あれ、ここにあった筆入れはどこにやったの?
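For anyone trying this in Ruby rather than on Rubular, here is a small sketch of the difference between the two patterns, using a shortened piece of the sample text. Two things are my assumptions: the nested version escapes the ASCII ? (it has to be escaped outside a character class), and the character class also lists the full-width ？ and ！ in case the real text uses those.

```ruby
# Shortened sample taken from the question's text.
SAMPLE_JA = "ところが当選したのは彼一人。その戯言、聞き飽きたわ!あれ、ここにあった筆入れはどこにやったの?"

# Nested group: scan returns a [sentence, punctuation] pair per match,
# which is why the punctuation shows up as a separate result.
NESTED = /(.*?(。|\?|!))/

# Character class and no groups: scan returns the sentences themselves.
# Assumption: full-width ？ and ！ are included for text that uses them.
SENTENCE = /.*?[。?!？！]/

p SAMPLE_JA.scan(NESTED).first  # ["ところが当選したのは彼一人。", "。"]
p SAMPLE_JA.scan(SENTENCE)      # three sentences, punctuation attached
```

Inside [...] the ? and ! need no escaping, which is part of why the character class version is easier to get right.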
What about this?
str.scan /[\p{Han}\p{Katakana}\p{Hiragana}\p{Hangul}[[:punct:]]]+/
=> ["強面のため周囲の人から敬遠されている主人公が、クラスメイトと共通の話題を持とうと",
"「アナザ",
"ワ",
"ルド」のベ",
"タテストに申し込んだ。ところが当選したのは彼一人。しかたなくひとりでゲ",
"ム内の仮想世界「イストピア」に「ケイオス」と名乗って乗り込んだが、そこはゲ",
"ムでありながら五感すべてを体感でき、現実と間違えるほどのリアルな世界だった。サポ",
"ト",
"のテミスの協力を得つつ、クエストをこなしていったが、実はそこは本物の異世界「イストピア」であり、ケイオス のこなしたクエストによって、多くの人が影響を受けて……というお話。その戯言、聞き飽きたわ!あれ、ここにあった筆入れはどこにやったの?"]
http://rubular.com/r/8CtYuV8AAl
How can I preserve consecutive newline characters in a Ruby here-document? In my program all of them are collapsed into one newline. For example:
s=<<END
1
2
3
4
END
evaluates to:
s="1\n2\n3\n4\n"
However, I would like to preserve the consecutive newlines, for example when formatting a BBCode document, a letter, or something similar.
That looks like a bug to me. Have you tried a multiline %q?
s=%q(1
2
3
4
)
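A quick way to narrow this down is to inspect the string right after it is created. In the sketch below (sample content is made up), both the here-document and the %q literal keep every blank line, so if the newlines disappear in your program, the collapsing is probably happening in a later step. That last part is a guess, since the rest of the program isn't shown.

```ruby
# A here-document with one blank line, then two blank lines.
HEREDOC = <<END
1

2


3
END

# The same content as a multiline %q literal.
PERCENT_Q = %q(1

2


3
)

p HEREDOC               # "1\n\n2\n\n\n3\n"
p HEREDOC == PERCENT_Q  # true
```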
Can I sort lines in vim depending on a part of line and not the complete line?
e.g
My Name is Deus Deceit
I would like to sort on the column where the name starts plus the next 6 columns,
for example
sort by column 19-25 and vim will only check those characters for sorting.
If it can be done without a plugin that would be great. ty
Check out :help :sort. The command takes an optional {pattern} whose matched text is skipped (i.e. sorting happens on what comes after the match).
For example, to sort by column 19+ (see :help /\%c and the related regexp atoms):
:sort /.*\%19c/
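If it helps to see what "sort by columns 19-25 only" means as a sort key, here is a rough Ruby model. It is a sketch, not a Vim feature: sort_by_columns is a hypothetical helper, the rows are made up, and it counts characters whereas \%c counts bytes, so the two agree only for single-byte text.

```ruby
# Rough model of sorting on a fixed column range (1-based, inclusive).
# sort_by_columns is a hypothetical helper, not a Vim or Ruby API.
def sort_by_columns(lines, first, last)
  # Lines too short to reach the range get an empty key.
  lines.sort_by { |line| line[first - 1, last - first + 1].to_s }
end

# Made-up rows: 18 columns of prefix, then a name padded to 7 columns.
ROWS = [
  "My Name is".ljust(18) + "Zeus".ljust(7) + " #1",
  "My Name is".ljust(18) + "Deus".ljust(7) + " #2",
  "My Name is".ljust(18) + "Ares".ljust(7) + " #3",
]

# Only columns 19-25 decide the order, so the "#n" suffix is ignored.
puts sort_by_columns(ROWS, 19, 25)
```

This prints the Ares row first, then Deus, then Zeus.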