How do I get 64-bit IDs to work with Sphinx search server 0.9.9 on Mac OS?

I've been using Sphinx successfully for a while, but just ran into an issue that has me confused. I back Sphinx with MySQL queries, and recently migrated my primary key strategy in a way that made the ids of the tables I'm indexing grow larger than 32 bits (in MySQL they're bigint unsigned). Sphinx was getting index hits, but returning nonsense ids (presumably the low 32 bits of the real ids, or something similar).
I looked into it and realized I hadn't passed the --enable-id64 flag to ./configure. No problem: I completely rebuilt Sphinx with that flag (I'm running 0.9.9, by the way). No change though! I'm still seeing the exact same issue. My test scenario is pretty simple:
MySQL:
create table test_sphinx(id bigint unsigned primary key, text varchar(200));
insert into test_sphinx values (10102374447, 'Girls Love Justin Beiber');
insert into test_sphinx values (500, 'But Small Ids are working?');
Sphinx conf:
source new_proof
{
    type            = mysql
    sql_host        = 127.0.0.1
    sql_user        = root
    sql_pass        = password
    sql_db          = testdb
    sql_port        =
    sql_query_pre   =
    sql_query_post  =
    sql_query       = SELECT id, text FROM test_sphinx
    sql_query_info  = SELECT * FROM `test_sphinx` WHERE `id` = $id
    sql_attr_bigint = id
}
index new_proof
{
    source         = new_proof
    path           = /usr/local/sphinx/var/data/new_proof
    docinfo        = extern
    morphology     = none
    stopwords      =
    min_word_len   = 1
    charset_type   = utf-8
    enable_star    = 1
    min_prefix_len = 0
    min_infix_len  = 2
}
Searching:
→ search -i new_proof beiber
Sphinx 0.9.9-release (r2117)
...
index 'new_proof': query 'beiber ': returned 1 matches of 1 total in 0.000 sec
displaying matches:
1. document=1512439855, weight=1
(document not found in db)
words:
1. 'beiber': 1 documents, 1 hits
→ search -i new_proof small
Sphinx 0.9.9-release (r2117)
...
index 'new_proof': query 'small ': returned 1 matches of 1 total in 0.000 sec
displaying matches:
1. document=500, weight=1
id=500
text=But Small Ids are working?
words:
1. 'small': 1 documents, 1 hits
Anyone have an idea why this is broken?
Thanks in advance
-Phill
EDIT
Ah, okay, got further. I didn't mention that I've been doing all of this testing on Mac OS, and it looks like that may be my problem: I just compiled in 64-bit mode on Linux and it works great. There's also a clue in the banner of the Sphinx command-line tools that the compile flag didn't take:
My Mac (broken)
Sphinx 0.9.9-release (r2117)
Linux box (working)
Sphinx 0.9.9-id64-release (r2117)
So I guess the new question is: what's the trick to compiling with 64-bit keys on Mac OS?

Did you rebuild the index with the 64-bit indexer?
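In other words, the whole build-install-reindex cycle has to be redone. A minimal sketch of what that could look like (assuming the /usr/local/sphinx install prefix implied by the config above):

# start from a pristine tree so no 32-bit objects are reused
make distclean
./configure --enable-id64
make
sudo make install

# the banner should now read "Sphinx 0.9.9-id64-release (r2117)"
/usr/local/sphinx/bin/indexer

# rebuild every index so the files on disk store 64-bit document ids
/usr/local/sphinx/bin/indexer --all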

Related

Change or personalize OID

I finally managed to get my Raspberry Pi Compute Module to work with SNMP.
I have a script running that exposes one of my parameters, and if I query it using SNMP I get the info back:
pi@raspberrypi:~ $ snmpwalk -v2c -c public localhost NET-SNMP-EXTEND-MIB::nsExtendObjects | grep snmp_status
NET-SNMP-EXTEND-MIB::nsExtendCommand."snmp_status" = STRING: /home/pi/BDC/snmp_status.py
NET-SNMP-EXTEND-MIB::nsExtendArgs."snmp_status" = STRING:
NET-SNMP-EXTEND-MIB::nsExtendInput."snmp_status" = STRING:
NET-SNMP-EXTEND-MIB::nsExtendCacheTime."snmp_status" = INTEGER: 5
NET-SNMP-EXTEND-MIB::nsExtendExecType."snmp_status" = INTEGER: exec(1)
NET-SNMP-EXTEND-MIB::nsExtendRunType."snmp_status" = INTEGER: run-on-read(1)
NET-SNMP-EXTEND-MIB::nsExtendStorage."snmp_status" = INTEGER: permanent(4)
NET-SNMP-EXTEND-MIB::nsExtendStatus."snmp_status" = INTEGER: active(1)
NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."snmp_status" = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutputFull."snmp_status" = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutNumLines."snmp_status" = INTEGER: 1
NET-SNMP-EXTEND-MIB::nsExtendResult."snmp_status" = INTEGER: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."snmp_status".1 = STRING: 0
If my unit is in alarm, it replies with
NET-SNMP-EXTEND-MIB::nsExtendOutLine."snmp_status".1 = STRING: 1
and if not in alarm, it replies with
NET-SNMP-EXTEND-MIB::nsExtendOutLine."snmp_status".1 = STRING: 0
This status is stored in a file, which is parsed into SNMP by a Python script.
Now... next question.
The SNMP server gives me the following OID
.1.3.6.1.4.1.8072.1.3.2.3.1.2.11.115.110.109.112.95.115.116.97.116.117.115
and for each parameter it gives me a very different OID.
How can I change this to something easier to read, like the names we see in MIB files?
If you are doing it on the command line, use:
snmptranslate -m NET-SNMP-EXTEND-MIB .1.3.6.1.4.1.8072.1.3.2.3.1.2.11.115.110.109.112.95.115.116.97.116.117.115
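Run against the OID above, that should print the symbolic name with its decoded index (going by the decoding below, the index bytes spell out "snmp_status"), something like:

NET-SNMP-EXTEND-MIB::nsExtendOutputFull."snmp_status"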
To do it purely programmatically (i.e. without parsing command line output), you will need a way to parse the MIB files. I think such tools probably exist in Python, but I've never used them myself.
More often, I hard-code constants for the OIDs that I'm interested in, and manually inspect the MIB to know how to decode the index for each object. The OID you gave is an instance of NET-SNMP-EXTEND-MIB::nsExtendOutputFull, which belongs to nsExtendOutput1Entry. Normally the *Entry types will have an INDEX field telling you which field is used as the index of that table. In this case, it has an AUGMENTS field instead, which points you to nsExtendConfigEntry. The INDEX for nsExtendConfigEntry is nsExtendToken, which has a type of DisplayString (basically an OCTET STRING that is limited to human-readable characters).
Here's an example of how I would do this in Python; you'll need pip install snmp:
from snmp.types import OID, OctetString

# the column OID for nsExtendOutputFull; everything after it is the table index
nsExtendOutputFull = OID.parse(".1.3.6.1.4.1.8072.1.3.2.3.1.2")
oid = OID.parse(".1.3.6.1.4.1.8072.1.3.2.3.1.2.11.115.110.109.112.95.115.116.97.116.117.115")

# strip the column prefix and decode the remaining sub-identifiers as a string index
nsExtendToken = oid.extractIndex(nsExtendOutputFull, OctetString)
print(f"Index = {nsExtendToken}")
Here's the output:
Index = OctetString(b'snmp_status')

Limit of total fields crossed more than 1000 - Elasticsearch exception

My Elasticsearch index has more than 1000 fields due to my SQL schema, and I get the exception below:
{'type': 'illegal_argument_exception', 'reason': 'Limit of total fields [1000] in index'}
And my bulk insert looks like this:
import json
from elasticsearch import Elasticsearch, helpers
from elasticsearch.helpers import BulkIndexError

BATCHSIZE = 1000              # flush to Elasticsearch every BATCHSIZE records
es = Elasticsearch()

doc = {}                      # renamed from `dict` to avoid shadowing the built-in
tempdict = {}
batch = []
counter = 0
insertedrecords = 0

with open('audit1.txt') as file:
    for line in file:
        columns = line.split('||')
        doc['TimeStamp'] = columns[0].strip('\'')
        doc['BusinessTimeStamp'] = columns[1].strip('\'')
        doc['RuntimeMicroflowID'] = columns[2].strip('\'')
        doc['MicroflowID'] = columns[3].strip('\'')
        doc['UserId'] = columns[4].strip('\'')
        doc['ClientId'] = columns[5].strip('\'')
        doc['Userlocation'] = columns[6].strip('\'')
        doc['Transactionid'] = columns[7].strip('\'')
        doc['Catagorie'] = columns[8].strip('\'')
        doc['EventType'] = columns[9].strip('\'')
        doc['Operation'] = columns[10].strip('\'')
        doc['PrimaryData'] = columns[11].strip('\'')
        doc['SecondayData'] = columns[12].strip('\'')
        # the remaining columns come in (field name, old value, new value) triplets
        i = 13
        while i + 2 < len(columns):
            tempdict['BFOLDVALUE'] = columns[i + 1].strip('\'')
            tempdict['BFNEWVALUE'] = columns[i + 2].strip('\'')
            if columns[i].strip('\''):        # skip empty field names
                doc[columns[i].strip('\'')] = tempdict.copy()
            i += 3
            tempdict.clear()
        # copy the record, otherwise every batch entry points at the same dict,
        # which is cleared at the end of each iteration
        batch.append(doc.copy())
        if counter == BATCHSIZE:
            try:
                helpers.bulk(es, batch, index='audit-index', doc_type='audit')
                insertedrecords += counter
                counter = 0
                batch.clear()
                print(insertedrecords, " - records have been inserted")
            except BulkIndexError:
                print("Error occurred -- continuing")
                print(json.dumps(doc, indent=4))
                batch.clear()
                break
        counter += 1
        doc.clear()

# flush whatever is left after the loop
if batch:
    helpers.bulk(es, batch, index='audit-index', doc_type='audit')
So I am assuming I am indexing this the wrong way... is there a better way of indexing this kind of format in Elasticsearch? Note that I am using ELK version 7.5.
Here is the sample line I am parsing into Elasticsearch:
2018.07.17/15:41:53.735||2018.07.17/15:41:53.735||'0164a8424fbbp84h%2139165'||'BT_TTB_CashDep_PRC'||'eskedarz'||'UXP'||'00001039'||'0164a842e519pJpA'||'Persistence'||''||'CREATE'||'DailyTxns'||'0164a842e4eapJnu'||'CurrentThread'||'WebContainer : 15'||''||'ParentThread'||'system'||''||'TCPWorkerThreadID'||'WebContainer : 15'||''||'f_POSTINGDT'||'2018-07-17'||''||'versionNum'||'0'||''||'f_TXNAMTDR'||'0'||''||'f_ACCOUNTID'||'013XXXXXXXXX0'||''||'f_VALUEDTTM'||'2018-07-17 15:41:53.0'||''||'f_POSTINGDTTM'||'2018-07-17 15:41:53.692'||''||'f_TXNCLBAL'||'25551.610000'||''||'f_TXNREF'||'0000103917071815410685326'||''||'f_PIEVENTTYPE'||'N'||''||'f_TXNAMT'||'5000.00'||''||'f_TRANSACTIONID'||'0164a842e4e9pJng'||''||'f_TYPE'||'N'||''||'f_USERID'||'xxxarz'||''||'f_SRNO'||'1'||''||'f_TXNBASEEQ'||'5000.00'||''||'f_TXNSRCBRANCH'||'0000X039'||''||'f_TXNCODE'||'T08'||''||'f_CHANNELID'||'BranchTeller'||''||'f_TXNAMTCR'||'5000.00'||''||'f_TXNNARRATION'||'SELF '||''||'f_ISACCRUALPENDING'||'false'||''||'f_TXNDTTM'||'2018-07-17 15:41:53.689'||''
If you look carefully at this part of the error message, it becomes clear:
Limit of total fields [1000] in index
1000 is the default limit on the total number of fields in an Elasticsearch index, as shown in their source code:
public static final Setting<Long> INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING =
Setting.longSetting("index.mapping.total_fields.limit", 1000L, 0, Property.Dynamic, Property.IndexScope);
Please note this is a dynamic setting, hence it can be changed on a given index by updating that index's settings:
PUT test_index/_settings
{
    "index.mapping.total_fields.limit": 1500
}
(1500 here; change it to whatever is suitable for your index.)
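Since you are already using the Python client for the bulk insert, you could also apply the same setting programmatically. A minimal sketch (the index name is taken from your bulk call):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# raise the field limit on the existing index (dynamic setting, no reindex needed)
es.indices.put_settings(
    index='audit-index',
    body={'index.mapping.total_fields.limit': 1500},
)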
More info on this issue can be found here and here.
A better way to handle such an exploding index is to normalize it as you would in an RDBMS, i.e. store some of the key/value combinations in a nested structure.
Example: turn a record like
{"keyA": "ValueA", "keyB": "ValueB", "keyC": "ValueC", ...}
into
{"keyA": "ValueA", "Keyvalue": [{"key": "keyB", "value": "ValueB"}, {"key": "keyC", "value": "ValueC"}]}
so a search would look like Keyvalue.key == "keyB" and Keyvalue.value == "ValueB".
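For illustration, a sketch of what that nested approach could look like (index and field names are made up; syntax as in Elasticsearch 7.x):

PUT test_index
{
  "mappings": {
    "properties": {
      "Keyvalue": {
        "type": "nested",
        "properties": {
          "key":   { "type": "keyword" },
          "value": { "type": "keyword" }
        }
      }
    }
  }
}

GET test_index/_search
{
  "query": {
    "nested": {
      "path": "Keyvalue",
      "query": {
        "bool": {
          "must": [
            { "term": { "Keyvalue.key": "keyB" } },
            { "term": { "Keyvalue.value": "ValueB" } }
          ]
        }
      }
    }
  }
}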

How to convert a Visual FoxPro 6 report to Word

I have used the following code to show a report:
select mem_no,com_name,owner,owner_cate,iif(empty(photo),"c:\edrs\memphoto\void.JPG",iif(file(photo),photo,"c:\edrs\memphoto\void.JPG")) as photo from own1 into curs own
REPO FORM c:\edrs\reports\rptsearch.frx TO PRINT PREVIEW NOCONS
Here rptsearch.frx contains some images. The following code exports the data to Excel, except for the images:
COPY TO "c:\documents and settings\administrator\desktop\a.xls" TYPE XLS
For the images it shows only the path name. Now I need to know how I can convert this report to Word, so that the images appear in the Word document.
It looks like you are creating a simple list with pictures. One of the easiest ways to do that is to use automation, i.e.:
Select mem_no, com_name, owner, owner_cate, ;
    Iif(Empty(photo) Or !File(photo), "c:\edrs\memphoto\void.JPG", photo) As photo ;
    from own1 ;
    into Curs crsData ;
    nofilter

#Define wdWord9TableBehavior 1
#Define wdAutoFitWindow 2
#Define wdStory 6
#Define wdCollapseEnd 0
#Define wdCellAlignVerticalCenter 1
#Define CR Chr(13)

Local Array laCaptions[5]
laCaptions[1] = 'Mem No'
laCaptions[2] = 'Com Name'
laCaptions[3] = 'Owner'
laCaptions[4] = 'Owner Cate'
laCaptions[5] = 'Photo'

Local nRows, nCols, ix
nRows = Reccount('crsData') + 1
nCols = Fcount('crsData') + 1

oWord = Createobject('Word.Application')
With m.oWord
    oDocument = .Documents.Add
    With m.oDocument.Tables.Add( m.oWord.Selection.Range, m.nRows, m.nCols )
        .Borders.InsideLineStyle = .F.
        .Borders.OutsideLineStyle = .F.
        For ix = 1 To Alen(laCaptions)
            **** Add captions *****
            .Cell(1, m.ix).Range.InsertAfter( laCaptions[m.ix] )
        Endfor
        Select crsData
        Scan
            For ix = 1 To Fcount() - 1  && last one is photo path
                **** Add values to the different cells *****
                .Cell(Recno() + 1, m.ix).Range.InsertAfter( Eval(Field(m.ix)) )
            Endfor
            lcPhoto = crsData.photo
            If File(m.lcPhoto)  && Add photo if any
                oDocument.InlineShapes.AddPicture( m.lcPhoto, .T., .T., ;
                    .Cell(Recno() + 1, Fcount()).Range )
            Endif
            .Rows(Recno() + 1).Cells.VerticalAlignment = wdCellAlignVerticalCenter
        Endscan
    Endwith
    .Visible = .T.
Endwith
However, sending data to Word this way suffers performance-wise if you have many rows. You can use it for small data sets, like an employee table. With larger data, instead of using automation, you could simply create an HTML document; Word can open HTML documents.
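For illustration, a rough sketch of that HTML route (the output path is made up and the markup is deliberately minimal):

* Build a bare-bones HTML table from crsData and open it in Word
Local lcHtml, lcFile
lcFile = "c:\temp\report.html"   && hypothetical output path
lcHtml = "<html><body><table>"
Select crsData
Scan
    lcHtml = m.lcHtml + "<tr>" + ;
        "<td>" + Transform(mem_no) + "</td>" + ;
        "<td>" + Alltrim(com_name) + "</td>" + ;
        "<td><img src='" + Alltrim(photo) + "'></td></tr>"
Endscan
lcHtml = m.lcHtml + "</table></body></html>"
Strtofile(m.lcHtml, m.lcFile)

oWord = Createobject("Word.Application")
oWord.Documents.Open(m.lcFile)
oWord.Visible = .T.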
There is no native way to do it. I would investigate FoxyPreviewer and use that to report to RTF, which Word can open.
Or do it the other way round with a mail merge in Word.
In addition to FoxyPreviewer, you could also use OLE Office automation to build the report programmatically. There are numerous examples online, and even a book, Microsoft Office Automation with Visual FoxPro, written by Tamar E. Granor & Della Martin.
I have not done a lot with automation, only enough to basically verify that it works, and to discover that it was very slow for what I was attempting to do.

setTargetReturn in fPortfolio package

I have a three-asset portfolio. I need to set the target return for my second asset, but whenever I try I get the error below:
asset.ts <- as.timeSeries(asset.ret)
spec <- portfolioSpec()
setSolver(spec) <- "solveRshortExact"
constraints <- c("Short")
setTargetReturn(Spec) = mean(colMeans(asset.ts[,2]))
efficientPortfolio(asset.ts, spec, constraints)
Error: is.numeric(targetReturn) is not TRUE
Title:
MV Efficient Portfolio
Estimator: covEstimator
Solver: solveRquadprog
Optimize: minRisk
Constraints: Short
Portfolio Weights:
MSFT AAPL NORD
0 0 0
Covariance Risk Budgets:
MSFT AAPL NORD
Target Return and Risks:
mean mu Cov Sigma CVaR VaR
0 0 0 0 0 0
Description:
Sat Apr 19 15:03:24 2014 by user: Usuario
I have tried, and I have searched the web, but I have no idea how to set the target return to a specific expected return from the data set. I could copy in the mean of my second asset, but I think the decimals could affect the answer.
I ran into this error when using 2 assets.
It appears to be a bug in the PortOpt methods: when there are 2 assets, it runs .mvSolveTwoAssets, which looks for the TargetReturn in the portfolio spec. But as you know, a target return isn't always needed.
Also, your code has 2 separate variables for the spec: 'spec' and 'Spec'. Assuming 'Spec' is a typo, this line needs to be changed:
setTargetReturn(Spec) = mean(colMeans(asset.ts[,2]))
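Presumably, using fPortfolio's replacement-function form with the lowercase spec, the corrected line would be:

setTargetReturn(spec) <- mean(colMeans(asset.ts[,2]))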

Ruby multi-line regex

I have a Ruby multi-line string (called efixes) that looks like:
ID STATE LABEL INSTALL TIME UPDATED BY ABSTRACT
=== ===== ========== ================= ========== ======================================
1 S hayo32.02 xxxxxxx xxxxxxxx xxxxxxxxxxxxxxx
2 S 23434.23 xxxxxxx xxxxxxxx xxxxxxxxxxxxxxx
STATE codes:
S = STABLE
M = MOUNTED
U = UNMOUNTED
Q = REBOOT REQUIRED
B = BROKEN
I = INSTALLING
R = REMOVING
T = TESTED
P = PATCHED
N = NOT PATCHED
SP = STABLE + PATCHED
SN = STABLE + NOT PATCHED
QP = BOOT IMAGE MODIFIED + PATCHED
QN = BOOT IMAGE MODIFIED + NOT PATCHED
RQ = REMOVING + REBOOT REQUIRED
I only want to display the lines that start with a number, but I am having trouble: it doesn't seem to be matching. I found this solution (which I don't truly understand right now):
efixes_array = efixes.split("\n").select{|x| /\A[0-9]/.match(x)}
io.puts efixes_array.collect{|x| x.scan(/\A[0-9]/)}.flatten
It is only matching the numbers, but I want to display the entire line. As the end result, I want to display what is under the LABEL column.
This line from your example code
efixes.split("\n").select{|x| /\A[0-9]/.match(x)}
already returns an array of the full lines that start with a number.
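It's the scan in your second line that then throws the lines away and keeps only the leading digit. A minimal sketch of printing the whole lines and pulling out the LABEL column (assuming whitespace-separated columns, with LABEL as the third one):

matching = efixes.split("\n").select { |line| line =~ /\A\d/ }

# print the entire matching lines
io.puts matching

# LABEL is the third whitespace-separated column
labels = matching.map { |line| line.split[2] }
io.puts labels   # => hayo32.02, 23434.23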
