From: Ashley Sheridan on
On Fri, 2010-06-04 at 14:54 +0300, Tanel Tammik wrote:

> "Ashley Sheridan" <ash(a)ashleysheridan.co.uk> wrote in message
> news:1275652342.2217.51.camel(a)localhost...
> > On Fri, 2010-06-04 at 14:44 +0300, Tanel Tammik wrote:
> >
> >> "Ashley Sheridan" <ash(a)ashleysheridan.co.uk> wrote in message
> >> news:1275651371.2217.46.camel(a)localhost...
> >> > On Fri, 2010-06-04 at 14:12 +0300, Tanel Tammik wrote:
> >> >
> >> >> Hello,
> >> >>
> >> >> If a database holds webpage content that includes HTML tags, is it
> >> >> possible to search it while ignoring the tags?
> >> >>
> >> >> data : '<div style="">you need some styling!</div>'
> >> >>
> >> >> When I search for 'you style' I don't want to get any rows. Is that
> >> >> possible?
> >> >> When I search for 'you styling' I should get the row.
> >> >>
> >> >> Br
> >> >> Tanel
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> > Use a second field in the DB that stores the content without any HTML
> >> > tags. That way, you can search and not worry about tags and attribute
> >> > values getting in the way.
> >> >
> >> > Thanks,
> >> > Ash
> >> > http://www.ashleysheridan.co.uk
> >> >
> >> >
> >> >
> >>
> >> Is this the only way? Couldn't I do it in the MySQL query? That seems
> >> much cleaner...
> >>
> >> Br,
> >> Tanel
> >>
> >>
> >>
> >
> >
> > You could try to do it in MySQL with a regex to filter out the HTML
> > tags. The regex would be really complex though, and prone to failure if
> > the HTML wasn't perfectly formed. And it would be a *lot* slower than
> > searching a plain text field. I think it's far cleaner to use a second
> > field like that.
> >
> > Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
> >
> >
> >
> OK! Then I should use preg_replace before storing the search entry in the
> DB? What would be the regular expression for that? Basically I need to
> get rid of the HTML tags and everything inside them?
>
> Br
> Tanel
>
>
>


No, you'd have to use a regex within MySQL, not PHP. Like I said, it
would be very complex, and I wouldn't know where to begin writing a
query that would search for specific strings and ignore any content
within the < & > without writing sub-queries.

Also, you did see that I said it would be a lot slower, didn't you?
Imagine that a query currently takes a second to complete. With this
sort of complex regex it could take maybe 5 seconds. That's 5 seconds
per person searching.

Are you not able to make a second field in the DB?

Thanks,
Ash
http://www.ashleysheridan.co.uk


From: "Tanel Tammik" on

"Ashley Sheridan" <ash(a)ashleysheridan.co.uk> wrote in message
news:1275652880.2217.54.camel(a)localhost...
> Are you not able to make a second field in the DB?
>

Yes, I can. You misunderstood me, or I didn't express myself correctly. How
can I get rid of the tags before entering the data into the second field
created for the search engine?



Br
Tanel


From: Ashley Sheridan on
On Fri, 2010-06-04 at 15:00 +0300, Tanel Tammik wrote:

> Yes, I can. You misunderstood me, or I didn't express myself correctly. How
> can I get rid of the tags before entering the data into the second field
> created for the search engine?
>
>
>
> Br
> Tanel
>
>
>


Ah, right. Use strip_tags(). As long as the HTML is reasonably
well-formed, it should remove the tags correctly. Unclosed or broken
tags can make it strip more text than you expect.
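
A minimal sketch of filling that second field, assuming a hypothetical
helper called makeSearchText(). Only strip_tags() comes from the advice
above; the spacing, entity decoding and whitespace cleanup are extra
steps that may or may not suit your content.

```php
<?php
// Hypothetical helper: build the plain-text copy stored in the second,
// search-only column. The name and the cleanup steps are assumptions.
function makeSearchText(string $html): string
{
    // Put a space before each tag so words in adjacent elements
    // (e.g. "</p><p>") do not run together once the tags are gone.
    $spaced = str_replace('<', ' <', $html);

    // strip_tags() removes the tags themselves, including attributes
    // such as style="".
    $text = strip_tags($spaced);

    // Turn entities such as &amp; back into the characters users type.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace left behind by the removed tags.
    return trim((string) preg_replace('/\s+/', ' ', $text));
}

$html = '<div style="">you need some styling!</div>';
echo makeSearchText($html), "\n"; // prints: you need some styling!
```

You would call something like this once, when the page is saved, and
store the result next to the original HTML.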

Thanks,
Ash
http://www.ashleysheridan.co.uk


From: tedd on
At 2:12 PM +0300 6/4/10, Tanel Tammik wrote:
>Hello,
>
>If a database holds webpage content that includes HTML tags, is it
>possible to search it while ignoring the tags?
>
>data : '<div style="">you need some styling!</div>'
>
>When I search for 'you style' I don't want to get any rows. Is that
>possible?
>When I search for 'you styling' I should get the row.
>
>Br
>Tanel


Tanel:

If your database has HTML tags in it, then it's pretty simple to grab
the data from the db and perform strip_tags(). After that, you can
search what's left.
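
A rough sketch of that idea, assuming the rows have already been fetched
into an array. The $rows data and the searchVisibleText() name are made
up for illustration. Note that this checks every row in PHP, so it is
probably only practical for small tables.

```php
<?php
// Hypothetical rows, standing in for what a SELECT would return.
$rows = [
    ['id' => 1, 'content' => '<div style="">you need some styling!</div>'],
    ['id' => 2, 'content' => '<p class="you">nothing to see here</p>'],
];

// Return the ids of rows whose visible text contains every search word.
function searchVisibleText(array $rows, string $query): array
{
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return []; // nothing to search for
    }

    $matches = [];
    foreach ($rows as $row) {
        // Search only what is left after the tags are removed.
        $text = strip_tags($row['content']);
        $found = true;
        foreach ($words as $word) {
            // stripos() is a case-insensitive substring test.
            if (stripos($text, $word) === false) {
                $found = false;
                break;
            }
        }
        if ($found) {
            $matches[] = $row['id'];
        }
    }
    return $matches;
}

var_dump(searchVisibleText($rows, 'you style'));   // no ids: "style" only appears inside a tag
var_dump(searchVisibleText($rows, 'you styling')); // id 1
```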

If you want to have the db do the search, then look into MySQL's
"full text" (FULLTEXT) indexes to do the searching for you.
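
As a sketch only: the table name (pages), the column name (search_text)
and both function names below are assumptions, and the column is
assumed to hold text that has already been through strip_tags(). The
server's minimum word length and stopword settings affect which words
get indexed, so test short words such as "you" against your own setup.

```php
<?php
// One-time schema change, kept here as a string for reference.
// Table and column names are hypothetical.
const ADD_INDEX_SQL =
    'ALTER TABLE pages ADD FULLTEXT INDEX ft_search_text (search_text)';

// Turn "you styling" into "+you +styling" so that, in boolean mode,
// every word is required. Punctuation is dropped so user input cannot
// inject boolean-mode operators.
function toBooleanQuery(string $terms): string
{
    $words = preg_split('/[^\p{L}\p{N}]+/u', $terms, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    return implode(' ', array_map(fn ($w) => '+' . $w, $words));
}

// Run the full-text search through an existing PDO connection and
// return the matching ids.
function searchPages(PDO $pdo, string $terms): array
{
    $stmt = $pdo->prepare(
        'SELECT id FROM pages
         WHERE MATCH(search_text) AGAINST (:terms IN BOOLEAN MODE)'
    );
    $stmt->execute([':terms' => toBooleanQuery($terms)]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```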

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com