Benjamin De Boe · Jun 18, 2018

Hi Eduard,

For this sort of querying (and many other uses outside straight API calls), you can use the SQL projections generated for your domain, as documented in %iKnow.Tables.Utils. That'll generate a column for your Views metadata field on the table containing Source information, which you can then join to the Part (entity occurrence) table to filter for the sources containing the requested entity.

Hope this helps,
benjamin
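Something along these lines should then do the trick (treat this as a sketch: the package name MyPkg and the column names are just placeholders, and the exact table and column names depend on how you generated the projections):

SELECT s.SourceId, s.Views
  FROM MyPkg.Source s
  JOIN MyPkg.Part p ON p.SourceId = s.SourceId
  JOIN MyPkg.Entity e ON e.EntityId = p.EntityId
 WHERE e.EntityValue = 'the entity you are after'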
Benjamin De Boe · Jun 14, 2018

Hi Eduard,

Looking at your code, there seem to be a few small things that may each contribute to not seeing the results you were expecting:

- The MaintenanceAPI:GetBlackListElements() call returns its results as result(n) = $lb(id, string), with n just an incrementing integer representing the row number. At the other end, the ContainsEntityFilter expects array(string) or a $listbuild(string1, string2, ...). So your filter might be selecting sources containing the strings "1", "2", etc. (See the sketch at the end of this post.)
- SourceAPI:GetByDomain() returns result(n) = $lb(sourceID, externalID). That source ID is an internally generated integer ID that has no link to your source table Text.Data. The external ID is typically composed of what you selected as the group field and identifier field when loading from a SQL table, so depending on how you set up your domain, that may indeed be the ID field of your Text.Data table. It looks like you have the "simple external IDs" feature switched on, which is why your external IDs consist of just the identifier field, making things indeed easier (but usually only useful/safe when loading from a single table!). Note that this is slightly different for DeepSee-managed domains, where the source ID equals the external ID and corresponds to DeepSee's fact ID, but ignore this confusing comment when not using DeepSee.
- Finally, and likely irrelevant: you're passing in $$$YES when initializing filterNot. I'm not sure where you're loading that macro from, but that argument should be a %Boolean with a value of 1 to work as expected; a string value would translate to a %Boolean with value 0.

Hope this helps,
benjamin
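To illustrate the first point, something along these lines should repackage the blacklist strings into a format the filter accepts (treat this as an untested sketch; the variable names and the exact argument order for GetBlackListElements() are assumptions to verify against the class reference):

do ##class(%iKnow.Utils.MaintenanceAPI).GetBlackListElements(.elements, domainId, blackListId)
set strings = "", n = ""
for {
    set n = $order(elements(n), 1, data)
    quit:n=""
    set strings = strings _ $listbuild($list(data, 2))  // append the string, not the row number
}
set filter = ##class(%iKnow.Filters.ContainsEntityFilter).%New(domainId, strings)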
Benjamin De Boe · May 4, 2018

We currently don't support analytic windowing functions (PARTITION BY syntax), but we have been looking into them for a future release. MATCH_RECOGNIZE is certainly one of the more advanced ones in that bucket. Is this the one you need, or do you have scenarios that would be served by core windowing functionality, excluding the pattern matching piece? Or is it the pattern matching rather than the windowing you're looking for?
Benjamin De Boe · May 4, 2018

I'm afraid we don't support the SQL PIVOT command, so unless you can enumerate the response codes as columns explicitly, you can only organise them as rows. If you control the application code, you could of course first run a query selecting all response codes and then generate the lengthy SQL call that includes a separate column for each response code. Something like SUM(CASE bRecord.ResponseCode WHEN 'response code 1' THEN 1 ELSE 0 END) AS ResponseCode1Count should work fairly well.
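Spelled out a bit further, the generated query could look something like this (MySchema and the response code values are just placeholders for your actual table and codes):

SELECT SUM(CASE bRecord.ResponseCode WHEN 'response code 1' THEN 1 ELSE 0 END) AS ResponseCode1Count,
       SUM(CASE bRecord.ResponseCode WHEN 'response code 2' THEN 1 ELSE 0 END) AS ResponseCode2Count,
       SUM(CASE bRecord.ResponseCode WHEN 'response code 3' THEN 1 ELSE 0 END) AS ResponseCode3Count
  FROM MySchema.bRecord bRecord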
Benjamin De Boe · Mar 27, 2018

Horita-san,

The SetParameter() method requires that your domain has an ID assigned, which it gets automatically as soon as you call %Save() for the first time. Note that it should be returning an error in the sequence of commands you pasted, but that error went unnoticed because the do syntax discards the returned %Status.

In general, it should be more convenient to work with Domain Definitions rather than the %iKnow.Domain API directly.

Thanks,
benjamin
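For reference, a minimal sketch of the order of operations when using the API directly (the domain name and the DefaultConfig parameter are just examples, and checking the returned %Status is what surfaces the error hidden by the do syntax):

set domain = ##class(%iKnow.Domain).%New("MyDomain")
set sc = domain.%Save()                                      // this is what assigns the domain ID
if 'sc { do $system.Status.DisplayError(sc)  quit }
set sc = domain.SetParameter("DefaultConfig", "MyConfig")    // only works once the ID exists
if 'sc { do $system.Status.DisplayError(sc) }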
Benjamin De Boe · Sep 19, 2017

Hi Steve,

I hadn't seen this question until just now, but I have to admit we're a bit storage-hungry with iKnow. If you generate the default full set of indices, for a moderately sized domain you'll need up to 25x the original dataset size (measured as raw text) to fit everything. This can drop to half that (12x) if you forsake all non-essential indices, but that will prevent a number of queries from running smoothly or, in some cases, disable them completely.

For iFind, the numbers depend on the type of index: count on factors of 2x, 7x and 15x for Basic, Semantic and Analytic indices, respectively. Of course there's a difference in functionality between all these options, and it's best to start from a set of functional requirements and then look at which particular approach covers them.

These numbers are somewhat conservative maximums and, as Eduard already suggested, you may see different (lower) numbers depending on the nature of your data. A more detailed sizing guide is available on request.

Thanks,
benjamin
Benjamin De Boe · Jul 3, 2017

Thanks John,

Indeed, you'd need a proper license in order to work with iKnow. If the method referred to above returns 0, please contact your sales representative to request a temporary trial license and appropriate assistance for implementing your use case.

Also, iKnow doesn't come as a separate namespace. You can create (regular) namespaces as you prefer and use them to store iKnow domain data. You may need to enable your web application for iKnow, which is disabled by default for security reasons, in the same way DeepSee is. See this paragraph here for more details.
Benjamin De Boe · Jun 26, 2017

Hi Eduard,

You can define iFind indices for calculated fields, so if you point your field calculation to a function that strips out the HTML, you should be fine. The HTML converter in iKnow was built for a slightly different purpose, but can be used here:

Property HtmlText As %String(MAXLEN = "");

Property PlainText As %String(MAXLEN = "") [ Calculated, ReadOnly, SqlComputed, SqlComputeCode = { set {PlainText} = ##class(%iKnow.Source.Converter.Html).StripTags({HtmlText}) } ];

Regards,
benjamin
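An iFind index on that calculated property would then look something like this (the index name is arbitrary and a basic index class is assumed here; pick the %iFind index class that matches your functional needs):

Index PlainTextIdx On (PlainText) As %iFind.Index.Basic;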
Benjamin De Boe · Jan 5, 2017

Hi Benjamin,

The (patented) magic of iKnow is the way it identifies concepts in sentences. That happens in a library shipped as a binary, which we refer to as the iKnow engine and which is used by both the iKnow APIs and iFind indices. Most of what happens with that engine's output is not nearly as much rocket science and, as Eduard indicated, its COS source code can usually be consulted for clues on how it works if you're adventurous.

The two options of the GetSimilar() query both work by looking at the top concepts of the reference source and then looking for other sources that contain them as well, using frequency and dominance, respectively, to weight in-source relevance. So not much rocket science, and only support for full matches at this point. (See the sketch at the end of this post.)

That said, iKnow offers you the building blocks to build much more advanced things, quite possibly inspired by your academic research, leveraging the concept level that is unique to iKnow in identifying what a text is really about. For example, you can build vectors containing entity frequency or dominance and look for cosine similarity in this vector space, or you can leverage topic modelling. Many of these approaches require quite a bit of computation, though, and actual result quality may depend a bit on the nature of the texts you're dealing with, which is why we chose to stick to very simple things in the kit for now.

However, you can find two (slightly) more advanced options in demos we have published online:

- The iKnow Investigator demo is part of your SAMPLES namespace and offers, next to the SourceAPI:GetSimilar() option, an implementation that builds a dictionary from your reference source, matches it against your target sources and looks for the ones with the highest aggregated match score, which accounts for partial matches as well.
- In the iFind Search Portal demo, the bottom of the record viewer popup displays a list of similar records as well. This one is populated based on a bit of iFind juggling implemented in the Demo.SearchPortal.Utils class, leveraging full entity matches only, but easy to extend to weigh in words as well.

In both cases, there's a myriad of options to refine these algorithms, but all at a certain compute cost, given the high dimensionality introduced by the iKnow entity (and actually even word) level. If you have further ideas or, better yet, sample code to achieve better similar-document lists, we'd be thrilled to read about it here on the community ;o)

Thanks,
benjamin
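For completeness, a minimal sketch of the GetSimilar() call referred to above (the variable names are placeholders, the 1 and 10 are simply page and page size, and the option-selecting arguments are left at their defaults; check the class reference for the full signature):

do ##class(%iKnow.Queries.SourceAPI).GetSimilar(.result, domainId, srcId, 1, 10)
zwrite result  // each entry describes a similar source and how much it overlaps with the seed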
Benjamin De Boe · Dec 22, 2016

Hi Eduard,

The thing that comes closest here is the %iKnow.Queries.SourceAPI:GetSimilar() query, which, for a given seed document, looks for the most similar ones in a domain, optionally constrained by a filter object. The results of that query include a figure like the one you're looking for, expressing how many entities were new in the seed document vs the corpus it's comparing against. Although that particular calculation isn't available as an atomic function, a simple way to get to what you want would be to use the %iKnow.Filters.SourceIdFilter and just compare against an individual document.

If you prefer to write more code :o), you can just look up the entities in the one document and compare them against those in the others through the %iKnow.Objects.EntityInSourceDetails SQL projection.

Regards,
benjamin
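As an illustration of that second route, a query along these lines would count the entities appearing in one document but not in another (the domain and source IDs are made up, and the column names are assumptions to check against the projection):

SELECT COUNT(*) AS NewEntities
  FROM %iKnow_Objects.EntityInSourceDetails seed
 WHERE seed.DomainId = 1 AND seed.SourceId = 123
   AND seed.EntityId NOT IN (SELECT other.EntityId
                               FROM %iKnow_Objects.EntityInSourceDetails other
                              WHERE other.DomainId = 1 AND other.SourceId = 456)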