Highlight Search Results from Elasticsearch using Vue.js - elasticsearch

I am attempting to build a search engine with an Elasticsearch backend and a Vue.js front end. It is primarily based upon this tutorial. It was suggested that a nice feature would be to highlight the search term in the results, i.e. if I search "foo" in the search bar, a result containing "foobar" would be shown with "foo" highlighted.
I have tried placing the v-html block into various divs, without success.
The records in my Elasticsearch index match this format:
result._source.Title,
result._source.description,
result._source.contact,
result._source.contactEmail
HTML:
<div class="row">
<div class="col-md-6" v-for="result in results" v-html="highlight(result._source.Title)">
<div class="ul">
<ul>
<li>{{ result._source.Title }},</li>
<li>{{ result._source.description }},</li>
<li>{{ result._source.contact }},</li>
<li>{{ result._source.contactEmail }} </li>
JS: highlight function
highlight(text) {
return text.replace(new RegExp(this.query, 'gi'), '<span class="highlight">$&</span>')
}
}
I have been able to implement basic functionality to highlight a single data point, but the rest of the page does not render. For instance, in this example, title will be highlighted, but the description, contact, and contactEmail will not render on the page. Additionally, if the search term does not match the title, the page errors out.

It's likely due to the function call in your v-html directive: highlight(result._source.Title). v-html replaces the entire content of the element it is placed on, so the nested list is discarded and only the highlighted title is rendered.
Instead, use v-html on a separate element for each result property, and replace occurrences of the search term in every property before the template is updated.
Try updating your code along these lines:
new Vue({
el: '#app',
data() {
return {
content: [{
title: 'Result 1',
text: `Phasellus euismod neque diam, aliquam commodo neque venenatis in. Sed non eros lorem. Fusce sit amet gravida nunc. Nunc non pulvinar tellus. Donec rutrum sagittis nulla eu commodo. Morbi condimentum molestie tortor venenatis dignissim. Aenean ac ligula at lectus pharetra sagittis. Integer convallis ipsum ex, ut congue urna auctor in. Sed consequat elit ipsum, eu vestibulum nisl egestas sit amet. Aenean eu mi et metus congue porttitor ut vitae augue. Donec congue semper euismod. Nam eget turpis eros. In vitae viverra eros.`
},
{
title: 'Result 2',
text: `Nunc vehicula lorem a enim pharetra pellentesque. Nullam nulla nisi, imperdiet at blandit in, molestie sed ipsum. Curabitur elit nisl, aliquam vel urna at, tincidunt interdum tortor. Sed lacinia urna non tellus consectetur molestie. Cras nunc justo, suscipit eu luctus eget, viverra at nisl. Curabitur ut sodales justo, sit amet varius arcu. Nulla non varius justo, ut mattis diam. Nam venenatis malesuada enim. Ut id convallis augue. Pellentesque pretium aliquam porttitor. Donec at velit pulvinar, consequat eros at, ultrices urna. Sed non nisl tellus. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Nullam dictum ipsum dolor, vitae volutpat dolor porta non.`
},
{
title: 'Result 3',
text: `Morbi bibendum justo enim, aliquam placerat magna euismod in. Sed ullamcorper augue ac nisl efficitur, eget ullamcorper neque rhoncus. Duis ac tristique orci. Curabitur lorem purus, varius eu sodales non, feugiat auctor orci. Pellentesque feugiat, felis eu accumsan ornare, lacus metus ultrices ipsum, vitae consequat velit neque in sapien. Nunc fringilla sollicitudin hendrerit. Vestibulum at massa convallis, finibus lacus ac, venenatis nulla. Nulla in condimentum metus. Donec sagittis nulla sed elit semper tristique. Vivamus facilisis sed lectus sed semper.`
},
{
title: 'Result 4',
text: `Phasellus suscipit eros ex, sed auctor turpis accumsan non. In ultrices convallis sem id tempor. Sed elementum ac lectus et scelerisque. Mauris vel leo a sem elementum volutpat. Vestibulum congue urna id velit porta, id scelerisque nulla pulvinar. Aliquam sit amet iaculis enim. Vestibulum enim tortor, sodales ut pharetra semper, eleifend sed lectus. Phasellus fringilla leo vel turpis feugiat lacinia. Morbi neque dui, vulputate eget molestie non, hendrerit eu felis. Phasellus erat erat, tempus ut mi ut, maximus dapibus nulla. Phasellus dignissim sollicitudin velit sit amet rhoncus. Curabitur commodo magna eget ex consequat, eget sollicitudin metus rhoncus. Aenean enim libero, dictum nec tempus quis, molestie at nulla.`
},
{
title: 'Result 5',
text: `Mauris ullamcorper mauris nec justo sodales, ac facilisis ipsum fringilla. Nam at urna eu ante luctus dignissim. In sit amet magna aliquam nibh tincidunt luctus vitae at arcu. Proin eu cursus tortor. Proin porttitor erat ac tortor ullamcorper lacinia. Curabitur sit amet ullamcorper ligula, rhoncus euismod est. Praesent non quam fermentum, bibendum lectus vel, auctor enim. Fusce eu viverra lectus. In molestie sit amet velit bibendum accumsan. Donec venenatis, urna sed convallis gravida, est est luctus mi, quis maximus ipsum metus sit amet ex. Vestibulum et nisi eu enim faucibus fermentum. Pellentesque pellentesque ultrices risus vel rutrum. Curabitur hendrerit urna in leo finibus rutrum. Maecenas posuere ultricies lectus eget elementum. Sed lacinia efficitur nisl, ac gravida urna ullamcorper consectetur. Aliquam erat volutpat.`
}
],
search: null
}
},
computed: {
blocks() {
if (this.search) {
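// Escape regex metacharacters so that input such as "(" or "." is matched literally
const regex = new RegExp(this.search.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g')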
return this.content.map(c => {
return {
title: c.title.replace(regex, `<span class="highlight">${this.search}</span>`),
text: c.text.replace(regex, `<span class="highlight">${this.search}</span>`)
}
})
}
return this.content
}
}
})
.highlight {
background-color: yellow;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/vue/2.5.17/vue.js"></script>
<div id="app">
<input type="text" v-model="search">
<ul>
<li v-for="(block,i) in blocks" :key="i">
<p>Title: <span v-html="block.title"></span></p>
<p>Text: <span v-html="block.text"></span></p>
</li>
</ul>
</div>
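As a complement to the snippet above, here is a minimal sketch of a standalone highlight helper that could be called from the computed property. It assumes the search term is plain user input: it escapes regex metacharacters, matches case-insensitively (like the 'gi' flags in the question), tolerates a missing field, and HTML-escapes everything it outputs, since v-html renders raw HTML. The names escapeRegExp and escapeHtml are illustrative helpers, not part of Vue.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Escape characters that have a special meaning in HTML.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap every case-insensitive occurrence of `query` in a highlight span.
// Returns an HTML string that is intended for use with v-html.
function highlight(text, query) {
  const source = text == null ? '' : String(text); // tolerate missing fields
  if (!query) return escapeHtml(source);           // nothing to highlight
  // The capture group makes split() keep the matched parts at odd indexes.
  const regex = new RegExp('(' + escapeRegExp(query) + ')', 'gi');
  return source
    .split(regex)
    .map((part, i) =>
      i % 2 === 1
        ? '<span class="highlight">' + escapeHtml(part) + '</span>'
        : escapeHtml(part)
    )
    .join('');
}
```

In the computed property, each field would then become something like highlight(c.title, this.search), and likewise for the other fields.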

Related

How to ignore URL when searching using ElasticSearch?

Hi, I have a set of documents that contain text, and some of them also have URLs inside:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam tincidunt metus a convallis imperdiet. Praesent interdum magna ut lorem bibendum vehicula. Maecenas consectetur tortor a ex pulvinar, sit amet sollicitudin nunc maximus. Pellentesque non gravida ligula, imperdiet pharetra odio. Nunc non massa vitae mauris tempor tempus. Nulla ac laoreet tellus. Nulla consequat tortor eu eros euismod bibendum. Curabitur ante ligula, aliquet at lacus at, pretium convallis eros. Fusce id mi condimentum, tempor lorem ut, pharetra libero.
https://document.io/document/ipsum
In eget eleifend neque. Morbi ex leo, tincidunt non enim ut, rutrum suscipit metus. Cras laoreet ex ut massa consequat condimentum. Aenean finibus eu nisl ut rhoncus. Aliquam finibus nisl risus, id facilisis justo rutrum et. Aenean enim libero, commodo id mi ut, mattis sollicitudin tellus. Aliquam molestie ligula sit amet lorem malesuada, aliquet pretium dolor malesuada. Phasellus fringilla libero in sollicitudin tristique. Quisque molestie, enim et aliquam dapibus, ex erat ultrices nisi, luctus ornare lorem metus eu sapien.
I am using a match query to search for words inside the documents. However, as you can see, the URL sometimes contains words that also occur in the actual text, which distorts the results. Does Elasticsearch have a way to simply ignore the URLs and focus on the text?
I am using the english analyzer for this field at the moment.
You can use a pattern replace character filter (pattern_replace) in your analyzer. To remove URLs from the text, add this filter to the analyzer of the field. It has to run at index time, because the URLs are in the indexed documents, so existing documents need to be reindexed:
Filter:
"char_filter": {
"type": "pattern_replace",
"pattern": "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]",
"replacement": ""
}
This filter replaces each URL with an empty string before the text is tokenized, so words that occur only inside a URL no longer produce matches.
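The fragment above leaves out the surrounding index settings. Below is a rough sketch of how the filter might be wired into a custom analyzer, written as a JavaScript object that could be sent as the body of an index-creation request. The names strip_urls and english_no_urls are made up for illustration, and the standard tokenizer with the lowercase and porter_stem token filters is only an approximation of the built-in english analyzer, not a replacement for it. The stripUrls function applies the URL pattern locally so its effect can be checked before anything is indexed.

```javascript
// URL pattern for the char filter, as a string ("\\b" becomes \b in the regex).
const URL_PATTERN =
  '\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]';

// Hypothetical index settings: a custom analyzer that removes URLs
// before tokenizing. The filter and analyzer names are assumptions.
function buildIndexSettings() {
  return {
    settings: {
      analysis: {
        char_filter: {
          strip_urls: {
            type: 'pattern_replace',
            pattern: URL_PATTERN,
            replacement: ''
          }
        },
        analyzer: {
          english_no_urls: {
            type: 'custom',
            char_filter: ['strip_urls'],
            tokenizer: 'standard',
            filter: ['lowercase', 'porter_stem']
          }
        }
      }
    }
  };
}

// Local check of what the pattern removes from a piece of text.
function stripUrls(text) {
  return text.replace(new RegExp(URL_PATTERN, 'g'), '');
}

console.log(stripUrls('pharetra libero. https://document.io/document/ipsum In eget'));
console.log(JSON.stringify(buildIndexSettings(), null, 2));
```

The field mapping would then reference the custom analyzer by name instead of english.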

Match when paragraph contains sentences from indexes with Elasticsearch

I am using Elasticsearch to build a program that finds every place in a text where the Bible is quoted and identifies which verse is being quoted.
I indexed all the verses of the Bible in Elasticsearch; each verse is a document.
When I search by typing part of a verse, I get the right result (even with typing mistakes).
How can I scan a text to find every occurrence where a verse is quoted (even partially) and attribute the source of the verse to it, while tolerating errors (with the fuzziness parameter or with synonyms, I think)?
Example of my index:
{"index":{"_index":"test","_type":"","_id":1}}
{"fields":{"year":3560,"book":"1","chapter":1,"section":1,"text":"others words consectetur adipiscing and others words"},"id":"test1","type":"add"}
{"index":{"_index":"test","_type":"","_id":2}}
{"fields":{"year":3560,"book":"2","chapter":3,"section":2,"text":"others words a sagittis nisl quam and others words"},"id":"test2","type":"add"}
{"index":{"_index":"test","_type":"","_id":3}}
{"fields":{"year":3560,"book":"3","chapter":1,"section":5,"text":"others words Aliquam ultrices auctor pharetra and others words"},"id":"test3","type":"add"}
{"index":{"_index":"test","_type":"","_id":4}}
{"fields":{"year":3560,"book":"4","chapter":2,"section":4,"text":"others words Proin ut vestibulum and others words"},"id":"test4","type":"add"}
{"index":{"_index":"test","_type":"","_id":5}}
{"fields":{"year":3560,"book":"5","chapter":1,"section":5,"text":"others words Aenean pretium tincidunt aliquet and others words"},"id":"test5","type":"add"}
{"index":{"_index":"test","_type":"","_id":6}}
{"fields":{"year":3560,"book":"6","chapter":2,"section":1,"text":"others words In vitae sagittis and others words"},"id":"test6","type":"add"}
{"index":{"_index":"test","_type":"","_id":7}}
{"fields":{"year":3560,"book":"7","chapter":7,"section":7,"text":"others words ligula laoreet pharetra and others words"},"id":"test7","type":"add"}
{"index":{"_index":"test","_type":"","_id":8}}
{"fields":{"year":3560,"book":"8","chapter":1,"section":4,"text":"others words luctus eros a pretium and others words"},"id":"test8","type":"add"}
{"index":{"_index":"test","_type":"","_id":9}}
{"fields":{"year":3560,"book":"9","chapter":1,"section":7,"text":"others words ullamcorper eu id quam and others words"},"id":"test9","type":"add"}
{"index":{"_index":"test","_type":"","_id":10}}
{"fields":{"year":3560,"book":"10","chapter":5,"section":4,"text":"others words Nullam ac enim ac lacus hendrerit and others words"},"id":"test10","type":"add"}
I need to find all the passages in the following text that are in the index, in order to recover their sources:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla rhoncus, nulla vitae porta euismod, purus nisl faucibus nunc, a sagittis nisl quam id arcu. Sed sit amet arcu sed dui auctor bibendum. Proin ut vestibulum sem, id rutrum felis. Phasellus sagittis justo sit amet justo consequat, id scelerisque eros cursus. Quisque dapibus finibus euismod. Proin dui urna, auctor ut gravida quis, fringilla quis velit. Donec sed pulvinar leo. Sed pulvinar pharetra arcu nec egestas. Mauris non dapibus diam. Pellentesque quis pellentesque libero.
Aliquam ultrices auctor pharetra. Cras ullamcorper, odio sit amet aliquam convallis, magna nibh gravida nunc, sit amet volutpat elit purus eget lectus. Pellentesque eu est a risus euismod consequat. Duis id erat porttitor, sodales justo non, aliquet ex. Etiam tincidunt neque ut nisi commodo auctor. Sed congue urna ac tellus scelerisque hendrerit. Mauris lobortis sed dui ut varius.
Proin ac luctus felis. In vitae sagittis erat, nec luctus sapien. Aenean pretium tincidunt aliquet. Morbi at enim vel ligula laoreet pharetra. Sed dignissim luctus eros a pretium. Vestibulum molestie molestie nisi, vitae scelerisque nibh bibendum nec. Donec laoreet sapien sed vehicula dictum. Nullam ac enim ac lacus hendrerit tempor et vitae neque. Quisque at leo pretium, efficitur augue vitae, congue eros. Maecenas volutpat ante nec scelerisque vestibulum.
Donec tristique orci erat, nec imperdiet nulla commodo ut. Nam non odio vel quam cursus ullamcorper eu id quam. Duis volutpat, nisl eu interdum mattis, augue ipsum mollis leo, eget efficitur orci augue eget leo. Integer feugiat facilisis dolor ut vehicula. Maecenas quis feugiat massa. Curabitur feugiat odio eget ligula tincidunt sodales. Donec feugiat dapibus lectus, non maximus dui rhoncus vitae. Phasellus eget massa faucibus, tristique nibh sed, aliquet metus.
I do not know if I have been clear enough, so do not hesitate to ask if you need more detail.
I think this problem is handled by the Aho-Corasick algorithm, but I don't know how to integrate it into Elasticsearch.
Thank you!
If I understand your question correctly, you want to send a partial verse as the query:
"some partial verses" : query
and get the matching source documents back from Elasticsearch, with the response showing where the searched text occurs in them (which is what highlighting does).
Here is the simplest query that achieves this:
GET <index_name>/_search
{
"query": {
"match": {
"message": "partial verse"
}
} ,
"highlight" : {
"fields" : {
"message": {}
}
}
}
In the response you will get something like this:
"hits" : [
{
"_index" : "testSample",
"_type" : "_doc",
"_id" : "TkdvGXAB5bHyIJQ-QRow",
"_score" : 0.2876821,
"_source" : {
"bookName" : "bible",
"message" : "this is a good book"
},
"highlight" : {
"message" : [
"<em>this</em> is a good book"
]
}
}
]
The response is self-explanatory: the highlighted fragments are returned in a separate highlight section of each hit.
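One possible way to apply this query to a whole paragraph (my own assumption, not something the answer above spells out) is to split the paragraph into sentences and build one fuzzy match query with highlighting per sentence. The sketch below only builds the request bodies; the field name fields.text is taken from the sample index in the question, and the function names are illustrative. Short sentences can produce weak matches, so the returned scores would still need a threshold.

```javascript
// Split a paragraph into sentences on ., ! and ? (a deliberately naive rule).
function splitSentences(paragraph) {
  const parts = paragraph.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map(s => s.trim()).filter(s => s.length > 0);
}

// Build a search body: fuzzy match on the verse text plus highlighting.
// `field` defaults to the field used in the question's sample index.
function buildVerseQuery(sentence, field = 'fields.text') {
  return {
    query: {
      match: {
        [field]: { query: sentence, fuzziness: 'AUTO' }
      }
    },
    highlight: {
      fields: { [field]: {} }
    }
  };
}

const paragraph =
  'Aliquam ultrices auctor pharetra. Cras ullamcorper, odio sit amet aliquam convallis.';
const bodies = splitSentences(paragraph).map(s => buildVerseQuery(s));
console.log(JSON.stringify(bodies, null, 2));
```

Each body can then be sent to the _search endpoint of the index, and the best-scoring hit for a sentence gives the candidate source verse.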

Is the following lossless data compression algorithm theoretically valid?

I am wondering if the following algorithm is a valid lossless data compression algorithm (although not practical with traditional computers, maybe quantum computers?).
At a high and simplified level, the compression steps are:
Calculate the character frequency of the uncompressed text.
Calculate the SHA3-512 (or another hash function) of the uncompressed text.
Concatenate the SHA3-512 and the character frequency (this is now the compressed text that would be written to a file).
And at a high and simplified level, the decompression steps are:
Using the character frequency in the compressed file, generate a permutation of the uncompressed text (keep track of which one).
Calculate the SHA3-512 of the generated permutation in step 1.
If the SHA3-512 calculated in step 2 matches the SHA3-512 in the compressed file, the decompression is complete. Else, go to step 1.
Would it be possible to have a SHA3-512 collision with a permutation of the uncompressed text (i.e. can two permutations of a given character frequency have the same SHA3-512?)? If so, when could this start happening (i.e. after how many uncompressed text characters?)?
One simplified example is as follows:
The uncompressed text is: "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas et enim vitae ligula ultricies molestie at ac libero. Duis dui erat, mollis nec metus nec, porttitor scelerisque enim. Aenean feugiat tellus sit amet facilisis imperdiet. Fusce et nisl porta, aliquam quam eget, mollis sapien. Sed purus est, efficitur elementum quam quis, congue rutrum libero. Etiam metus leo, hendrerit ac dui in, hendrerit blandit sem. Etiam pellentesque enim dapibus luctus volutpat. Praesent aliquet ipsum vitae mauris pulvinar, et pharetra leo semper. Nulla a mauris tellus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Integer sollicitudin dui sapien, in tempus arcu facilisis in. Vivamus dui dolor, faucibus eu accumsan eu, porttitor id risus. In auctor congue pellentesque. Cras malesuada enim eget est vehicula pretium. Phasellus scelerisque imperdiet lorem, eu euismod lectus convallis consequat. Nam vitae euismod est, vitae lacinia arcu. Praesent fermentum sit amet erat feugiat cursus. Pellentesque magna felis, euismod vel vehicula eu, tincidunt ac ex. Vestibulum viverra justo nec orci semper, nec consequat justo faucibus. Curabitur dignissim feugiat nulla, in cursus nunc facilisis id. Suspendisse potenti. Etiam commodo turpis non fringilla semper. Vivamus aliquam ex non lorem tincidunt, et sagittis tellus placerat. Proin malesuada tortor eu viverra faucibus. Curabitur euismod orci lorem, ut fermentum velit consectetur vel. Nullam sodales cursus maximus. Curabitur nec turpis erat. Vestibulum eget lorem nunc. Morbi laoreet massa vel nulla feugiat gravida. Nulla a rutrum neque. Phasellus maximus tempus neque, eu sagittis ex volutpat ac. Duis malesuada sem vitae lacus suscipit, eu dictum elit euismod. Sed id sagittis leo. Sed convallis nisi nisl, vel pretium elit cursus vel. Duis quis accumsan odio. Ut arcu ex, iaculis a lectus sit amet, lacinia pellentesque enim. Donec maximus ante odio, a porta odio luctus at. 
Nullam dapibus aliquet sollicitudin. Sed ultrices iaculis blandit. Suspendisse dapibus, odio non venenatis faucibus, justo urna euismod neque, non finibus ante ante in massa. Sed sit amet nunc vel lacus dictum euismod. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Interdum et malesuada fames ac ante ipsum primis in faucibus. Fusce varius lacus velit, venenatis consequat justo rutrum nec. Nunc cursus odio arcu, nec egestas purus feugiat nec. Aliquam efficitur ornare ullamcorper. Mauris consectetur, quam vitae ultricies ullamcorper, nulla nulla tempus risus, aliquet euismod urna erat gravida neque. Suspendisse et viverra enim, ut facilisis enim. Quisque quis elit diam. Morbi quis nulla bibendum, molestie risus egestas, pharetra nisl. Aliquam sed massa dictum, scelerisque odio vel, finibus tellus. Nam tristique commodo sem, a dictum risus euismod sed. Morbi vel urna nec sem consectetur auctor quis ac augue. Donec ac pellentesque tortor. In hendrerit ultricies consequat. Pellentesque non metus vitae elit euismod efficitur in in leo. Nulla ac pulvinar nunc. Donec porttitor nunc ante, et congue augue laoreet ac. Vivamus bibendum id est eleifend efficitur. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc arcu neque, molestie ac lorem id, feugiat efficitur erat. Vestibulum vel condimentum lectus, eu euismod turpis.".
The character frequency is: "⎵:501 e:345 i:277 u:266 s:240 t:226 a:219 l:161 n:154 r:147 m:132 c:128 o:117 d:79 .:64 p:54 ,:47 v:40 q:39 f:35 g:31 b:31 h:11 P:9 N:9 S:8 x:7 D:6 V:6 M:5 I:4 C:4 j:4 L:3 A:3 E:3 F:2 U:1 Q:1".
The SHA3-512 is: "45ebde65cf667d1bfdcf779baab84301c1d4abe60448be821adda9cf7b99b36a61c53233db4a0eda93a04c75201be13bbb638b5e78f5047560fffc97f1c95adb".
The compressed file contents are: "45ebde65cf667d1bfdcf779baab84301c1d4abe60448be821adda9cf7b99b36a61c53233db4a0eda93a04c75201be13bbb638b5e78f5047560fffc97f1c95adb⎵:501 e:345 i:277 u:266 s:240 t:226 a:219 l:161 n:154 r:147 m:132 c:128 o:117 d:79 .:64 p:54 ,:47 v:40 q:39 f:35 g:31 b:31 h:11 P:9 N:9 S:8 x:7 D:6 V:6 M:5 I:4 C:4 j:4 L:3 A:3 E:3 F:2 U:1 Q:1".
Your compression method assumes that there is only one permutation of the given character frequency table that will generate the given hash code. That's provably false.
A 512-bit hash can represent 2^512, or about 1.34E+154, unique values. The number of permutations of a 100-character file in which every character is distinct is 100!, or about 9.33E+157.
For such a file there are, on average, over 6,900 different permutations for each possible 512-bit hash code, so by the pigeonhole principle some permutations must share a hash.
Using a larger hash code won't help. The number of hash codes doubles with each bit you add, but the number of possible permutations grows faster: adding the n-th character to the file multiplies it by up to n.
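The arithmetic can be checked with exact integers. The sketch below uses JavaScript BigInt to compare 100! with 2^512, and adds an arrangements helper (the multinomial coefficient) for texts with repeated characters, such as the frequency table in the question. The function names are illustrative.

```javascript
// n! as a BigInt, so the result stays exact.
function factorial(n) {
  let result = 1n;
  for (let i = 2n; i <= BigInt(n); i++) result *= i;
  return result;
}

// Number of distinct orderings of a text with the given character counts:
// total! / (c1! * c2! * ...). Each intermediate division is exact.
function arrangements(counts) {
  const total = counts.reduce((sum, c) => sum + c, 0);
  return counts.reduce((acc, c) => acc / factorial(c), factorial(total));
}

const hashValues = 2n ** 512n;                        // possible 512-bit hashes
const perHash = factorial(100) / hashValues;          // average permutations per hash
console.log('permutations per hash value:', perHash.toString());

// A text with repeated characters has fewer orderings, e.g. "aab":
console.log('orderings of "aab":', arrangements([2, 1]).toString());
```

Passing the counts from a real frequency table to arrangements gives the size of the search space the decompressor would have to distinguish with a single hash.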

Image between two div with same text

I need to build this layout:
layout
After various tests, I still don't understand how to place the image.
The text continues from one div to the other. I thought of using the CSS3 column properties, but I don't think that is the best solution.
How can I implement this layout?
Thanks in advance.
EDIT:
This is the HTML and CSS code of the last test:
.span11{
width: 90%;
display: block;
margin: 0 auto;
-moz-column-count: 2;
-moz-column-gap: 20pt;
-webkit-column-count: 2;
-webkit-column-gap: 20pt;
column-count: 2;
column-gap: 20pt;
}
#foto{
float: right;
margin-top: 50px;
}
<div class="span11">
<div id="foto">
<img src="http://fpoimg.com/600x400?text=Preview" >
</div>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in dui ellus.Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in dui tellus. Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam.Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in dui tellus. Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam.Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in dui tellus. Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in dui tellus. Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. 
Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in tellus. Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in dui tellus. Pellentesque ac mi neque. Nulla ultricies nulla diam. Nulla luctus risus a ante varius euismod. Fusce viverra molestie enim, malesuada condimentum est consectetur id. Vestibulum laoreet libero vitae metus cursus, a auctor tellus tempor. Suspendisse lacinia tempus metus et lobortis. Suspendisse nec sapien eleifend, viverra lacus ut, pulvinar quam.
</div>
This is the JSFiddle of the last test:
http://jsfiddle.net/DVwws/
After working with the example you found here, I was able to edit the source code and create a JSFiddle with your desired result.
General guide to accomplish this:
Create two side-by-side divs filled with text.
Use the CSS content property on :before pseudo-elements to create "holes" in your paragraphs where they are desired, e.g. content: ""; width: 125px; height: 250px;, floated toward the center.
Use absolute positioning to place the image within the "hole" you created.
This image should help in understanding the placement concept (just imagine the green section as the hole, centered vertically):
Here is the HTML and CSS from the JSFiddle I made:
<style>
#page-wrap { width: 100%; margin: 80px auto; position: relative; }
#logo { position: absolute; top: 125px; left: 50%; margin-left: -125px; }
.left, .right { width: 49%; text-align: justify}
.left { float:left; }
.right { float:right; }
#l, #r { width: 100%; position: relative;}
#l { float: left; text-align: justify}
#r { float: right; text-align: justify}
#l:before, #r:before { content: ""; width: 150px; height: 250px; vertical-align:-50%;}
#l:before { float: right;}
#r:before { float: left; }
</style>
AND
<div id="page-wrap">
<img src="http://placekitten.com/250/250" id="logo">
<div class="left">Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. <div id="l">Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar facilisis. Ut felis. Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metuPellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. Aenean fermentum, elit eget tincidunt conum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar facilisis. Ut felis. Praesent dapibus, neque id cursusdiment faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. </div>Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus
</div>
<div class="right">Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. <div id="r">Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar facilisis. Ut felis. Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus
Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. </div>Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar facilisis. Ut felis. Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus
</div>
</div>
Also, since you said you are loading your text dynamically from a database, here is an easy way to calculate its length and break it into two equal chunks.
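<?php
// some SQL queries setting $str
$len = strlen($str);
$half = intdiv($len, 2);          // integer midpoint (PHP 7+)
$part1 = substr($str, 0, $half);  // first half: characters 0 .. $half - 1
$part2 = substr($str, $half);     // second half: from $half to the end
// Insert the $part1 and $part2 text chunks into each div.
?>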
NOTE: Because the cut is made at an exact character position, a word that straddles the midpoint will be split in two. There are resources that show how to cut at the next space instead, but that is beyond the scope of this question.
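For reference, cutting at a space near the midpoint could look like the following sketch. It is written in JavaScript rather than PHP and is only one possible rule: it takes the first space at or after the midpoint, falls back to the last space before it, and returns the whole string as the first part when there is no space at all. The function name is illustrative.

```javascript
// Split a string into two parts at a space close to its midpoint,
// so that no word is cut in half.
function splitAtWordBoundary(str) {
  const mid = Math.floor(str.length / 2);
  let cut = str.indexOf(' ', mid);                    // first space at or after the midpoint
  if (cut === -1) cut = str.lastIndexOf(' ', mid);    // otherwise the last space before it
  if (cut === -1) return [str, ''];                   // no space at all: nothing to split
  return [str.slice(0, cut), str.slice(cut + 1)];     // drop the space itself
}

const [left, right] = splitAtWordBoundary('Pellentesque habitant morbi tristique senectus');
console.log(left);
console.log(right);
```

The two returned chunks can then be placed in the left and right divs in the same way as $part1 and $part2.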

how to get value of height or width attribute of img tag using xpath at a given position

For the XML below, how can I get the value of the height or width attribute of the img element at a given index?
<root>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer non nunc vitae nisl luctus pharetra at eu nulla. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
<img src="http://www.google.com/logos/pacman10-hp.png"/> Nullam in odio at ligula euismod adipiscing convallis in justo. Donec at massa nulla, at facilisis magna. Integer sit amet elit eu felis venenatis dignissim. In ut mi leo. Suspendisse blandit faucibus fermentum. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Phasellus ultricies turpis id magna semper vestibulum.
</p>
<p>Quisque blandit pretium libero, venenatis pellentesque purus egestas id. Integer nulla ante, pellentesque eget rhoncus sed, semper vel eros. Nam placerat est et est dictum egestas. Ut gravida blandit lacus rhoncus feugiat. Nunc ut euismod eros. Pellentesque sit amet vehicula mauris. Quisque in nulla quis sapien dictum mattis. Curabitur vehicula lorem ac elit dignissim egestas. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Cras sit amet tincidunt quam.
<img src="http://www.google.com/logos/2010/gabor10-hp.png"/> Ut urna neque, mollis vel tempor placerat, cursus vel enim.
</p>
<p>Praesent gravida dignissim sagittis. Vivamus dictum nisi pulvinar augue vulputate euismod. Vestibulum arcu sapien, laoreet sagittis pulvinar ac, porttitor a tellus.
<img width="100" height="100" src="http://www.google.com/logos/2010/d4g_worldcup10_ko-hp.jpg"/> Quisque cursus dignissim libero in convallis. Fusce cursus nisi ut felis feugiat sodales. Praesent nec arcu purus. Donec lorem lectus, tristique eget faucibus sit amet, bibendum nec ipsum. Mauris tempus laoreet tortor non egestas. Aliquam erat volutpat. Aliquam erat volutpat. Phasellus a arcu convallis nibh luctus tempor non quis sem.
<img src="http://www.google.com/logos/2010/d4g_worldcup10_uk-hp.jpg"/> Aliquam ac risus velit, ut sodales justo. Ut eget lacus eget nisi hendrerit gravida quis et nibh. Etiam purus felis, fermentum a cursus at, congue vel eros. Aenean semper, sapien eget eleifend fermentum, odio sem tempor dolor, sed porta ligula nunc ac tellus.
</p>
<p>Mauris volutpat nisi vitae sem imperdiet sed ultricies est dictum. Mauris id urna turpis, sit amet rhoncus lectus. Maecenas vitae mi at nulla mattis congue id blandit purus.
<img src="http://www.google.com/logos/2010/d4g_worldcup10_nl-hp.jpg"/> Maecenas hendrerit, dui eget faucibus pretium, tellus augue pellentesque metus, id molestie diam arcu ac nibh. Suspendisse sollicitudin viverra blandit. Maecenas sed tellus quis purus bibendum eleifend. Nunc sodales magna id nulla tristique et suscipit purus interdum. Ut at risus quam, nec rutrum risus. Integer ac leo lorem, eget porta nisi. Sed quis lacus dapibus massa commodo ornare. Mauris scelerisque rutrum accumsan. Duis fermentum adipiscing mi eget suscipit. Duis quis nisi libero, iaculis fermentum purus. Etiam risus nibh, tincidunt pellentesque luctus sed, gravida vitae magna.
<img src="http://www.google.com/logos/2010/d4g_worldcup10_au-hp.jpg"/> Sed laoreet, erat id rutrum dignissim, elit libero fermentum enim, pretium auctor lectus urna vitae nulla. Nullam ante diam, elementum nec elementum quis, consectetur eget arcu.
</p>
<p>Fusce eu nisl risus. Fusce rhoncus iaculis viverra. Curabitur eleifend, nisl sed aliquam dapibus, urna leo scelerisque orci, id commodo dui libero vitae nisi.</p>
<img WIDTH="100" HEIGHT="100" src="http://www.google.com/logos/2010/d4g_worldcup10_nl-hp.jpg"/>
</root>
I tried
//img[1]/@width
but it does not work. Basically, I need an XPath expression that gets the height or width of an img tag irrespective of case (WIDTH or width); if the attribute is not present, it should return no match or null.
Use:
(//img)[$k]/@*[name() = 'width' or name() = 'WIDTH']
Replace $k with the desired image index.
This selects the attribute named "width" or "WIDTH" of the $k-th img element in the XML document. If that element has no such attribute, the result is an empty node-set.
For example, for the 3rd image use:
(//img)[3]/@*[name() = 'width' or name() = 'WIDTH']
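If mixed-case spellings such as Width also need to match, the XPath 1.0 translate() function can lowercase the attribute name before the comparison. The helper below is a small sketch that builds such an expression for a given image index and attribute name; the function name imgAttributeXPath is made up for illustration, and the resulting string still has to be evaluated with whatever XPath engine is in use.

```javascript
// Build an XPath 1.0 expression that selects an attribute of the k-th img
// element, comparing the attribute name case-insensitively via translate().
function imgAttributeXPath(index, attrName) {
  if (!Number.isInteger(index) || index < 1) {
    throw new RangeError('index must be a positive integer (XPath positions start at 1)');
  }
  if (!/^[A-Za-z_][A-Za-z0-9_.-]*$/.test(attrName)) {
    throw new TypeError('attrName must be a simple attribute name');
  }
  const upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  const lower = 'abcdefghijklmnopqrstuvwxyz';
  return `(//img)[${index}]/@*[translate(name(), '${upper}', '${lower}') = '${attrName.toLowerCase()}']`;
}

console.log(imgAttributeXPath(3, 'width'));
console.log(imgAttributeXPath(7, 'HEIGHT'));
```

As with the expression above, an img element without the attribute yields an empty node-set rather than an error.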
