From: jasondugdale on
Hi,
I'm working on a project using Oracle Text, and wondering if it is
possible display "relevant" parts of any documents returned by a
search (in a similar fashion to Google).

Heres a summary of what I have so far:

CREATE TABLE news(
news_id NUMBER, --Primary key: starts from "1"
name VARCHAR2(80), --Name of document
document CLOB --Document itself
);

INSERT INTO News(name, document)
VALUES('Councils warned over spying laws', 'Councils in England have
been urged to review the way they use surveillance powers to
investigate suspected crime. Under laws brought in to help fight
terrorism, councils can...');


If searching for "terrorism" for example, I can happily return the
name of the document, and using the SUBSTR and INSTR operators, can
return a substring from the document (either a substring starting at
the start of the document, or ending at the end of the document).
I would like to be able to return the part of the document relevant to
the "terrorism" search criteria (Under laws......help fight terrorism)
and was wondering if this was possible, as I could not find anything
in the Oracle 10g documentation about how to do it.

I have set up a text index to index all the entries in the "News"
table, which seems to work fine:

create index news_index
on news (document)
indextype is ctxsys.context;


I have also written a package containing the following procedure used
for entering the search keyword (p_string), and displaying the search
results:

procedure proc1
(p_string in varchar2 default null)
is

i number;
op varchar2(200);
pos number;

mklob clob;
v_amt number := 30000;
v_doc long;

begin

icp.p('Search within document');

icp.hr;
icp.formopen('jason.proc1','get');
icp.formtext('p_string',cvalue=>p_string);
icp.formsubmit;
icp.formclose;


begin

icp.hr;
i := 0;
for newsrec in (select name, news_id from news
where contains (document, p_string) > 0)

loop

i := i + 1;
icp.bold(newsrec.name);
icp.br;
pos := 1;

select substr(document, 1, instr(document, ' ', pos, 20)-1) into
op
from news
where contains (document, p_string) > 0;

icp.p(op || ' ...');

icp.br;

select count(news_id) into i from news;

icp.p(i || ' result(s) returned from search');

end;

exception
when others then
icp.p(sqlcode || ': ' || sqlerrm(sqlcode));

end proc1;

Any help would be greatly appreciated,
Thanks,
dugjason
From: Dan Blum on
jasondugdale(a)hotmail.co.uk wrote:
> Hi,
> I'm working on a project using Oracle Text, and wondering if it is
> possible display "relevant" parts of any documents returned by a
> search (in a similar fashion to Google).

Look up CTX_DOC.HIGHLIGHT in Oracle Text Reference.

--
_______________________________________________________________________
Dan Blum tool(a)panix.com
"I wouldn't have believed it myself if I hadn't just made it up."
From: jasondugdale on
>
> Look up CTX_DOC.HIGHLIGHT in Oracle Text Reference.
>

Hi,

I built a CTX_DOC.HIGHLIGHT operation, and inserted it at the end of
my loop, but it keeps returning DRG-10011: invalid object name, and
ORA-00931: missing identifier, errors. Here's what I wrote:

CTX_DOC.SET_KEY_TYPE('PRIMARY_KEY');

CTX_DOC.HIGHTLIGH(INDEX_NAME => 'news_index',
TEXTKEY =>
TO_CHAR(newsrec.news_id),
TEXT_QUERY => p_string,
RESTAB => mklob,
PLAINTEXT => TRUE);
From: jasondugdale on
Oops - in the previous post, I realise I have mis-spelled "HIGHLIGHT"
- this was just an error when copying across - it was spelled
correctly at compile time
From: BicycleRepairman on
On Jun 27, 8:31 am, jasondugd...(a)hotmail.co.uk wrote:
> Hi,
> I'm working on a project using Oracle Text, and wondering if it is
> possible display "relevant" parts of any documents returned by a
> search (in a similar fashion to Google).
>
> Heres a summary of what I have so far:
>
> CREATE TABLE news(
> news_id NUMBER, --Primary key: starts from "1"
> name VARCHAR2(80), --Name of document
> document CLOB --Document itself
> );
>
> INSERT INTO News(name, document)
> VALUES('Councils warned over spying laws', 'Councils in England have
> been urged to review the way they use surveillance powers to
> investigate suspected crime. Under laws brought in to help fight
> terrorism, councils can...');
>
> If searching for "terrorism" for example, I can happily return the
> name of the document, and using the SUBSTR and INSTR operators, can
> return a substring from the document (either a substring starting at
> the start of the document, or ending at the end of the document).
> I would like to be able to return the part of the document relevant to
> the "terrorism" search criteria (Under laws......help fight terrorism)
> and was wondering if this was possible, as I could not find anything
> in the Oracle 10g documentation about how to do it.
>
> I have set up a text index to index all the entries in the "News"
> table, which seems to work fine:
>
> create index news_index
> on news (document)
> indextype is ctxsys.context;
>
> I have also written a package containing the following procedure used
> for entering the search keyword (p_string), and displaying the search
> results:
>
> procedure proc1
> (p_string in varchar2 default null)
> is
>
> i number;
> op varchar2(200);
> pos number;
>
> mklob clob;
> v_amt number := 30000;
> v_doc long;
>
> begin
>
> icp.p('Search within document');
>
> icp.hr;
> icp.formopen('jason.proc1','get');
> icp.formtext('p_string',cvalue=>p_string);
> icp.formsubmit;
> icp.formclose;
>
> begin
>
> icp.hr;
> i := 0;
> for newsrec in (select name, news_id from news
> where contains (document, p_string) > 0)
>
> loop
>
> i := i + 1;
> icp.bold(newsrec.name);
> icp.br;
> pos := 1;
>
> select substr(document, 1, instr(document, ' ', pos, 20)-1) into
> op
> from news
> where contains (document, p_string) > 0;
>
> icp.p(op || ' ...');
>
> icp.br;
>
> select count(news_id) into i from news;
>
> icp.p(i || ' result(s) returned from search');
>
> end;
>
> exception
> when others then
> icp.p(sqlcode || ': ' || sqlerrm(sqlcode));
>
> end proc1;
>
> Any help would be greatly appreciated,
> Thanks,
> dugjason

Have you looked at ctx_doc.snippet()? That seems much more like what
you are asking for than ctx_doc.highlight().
The params are quite similar, so you may have problems like you're
having with Highlight below.
Keep it simple to start with -- don't call SET_KEY_TYPE; pass in only
the params that are required (index name, textkey, and query) and see
what happens.
In my case, I have a query like:
select icon, title, ctx_doc.snippet('doctext', :docid, :searchterms)
from mydocs where docid = :docid;