Lucene (Java) search plugin

Author: Paul Borgermans
Version: 0.73
Downloads: 3694
Compatible with: 3.7, 3.8 (also reported to be stable on 3.6)

A Lucene-based plugin for eZ publish enabling very rich and fast search. It requires a non-standard PHP extension and a Java VM, so you need almost full control over your server to install it. But IMHO it is worth the trouble, as you get enterprise-class searching in return :-)

Description

This Lucene-based plugin brings all the power of Lucene to eZ publish. Although it is in a beta state, it works very well in our test environment and on live production sites.

FEATURES
=========

- all of lucene, see http://lucene.apache.org/java/docs/queryparsersyntax.html, including boolean searches, range searches, fuzzy searches, wildcards, ...
- full implementation of eZ publish permission system (roles/policies)
- full use of eZ publish rich object model which is mapped to lucene "docs and fields"
- possibility to search on meta data like owner name (!)
- enhanced templates showing relevance
- custom boosting of classes, attributes and even datatypes to tune the search engine to your particular setup
- very fast searching and index updates
- "more like this" functionality (/lucene/similar/<nodeid>) (!)
- full support of advanced search
- support for sorting when called with fetch function
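
To give an idea of what the query parser syntax linked above looks like, here is a small sketch listing one sample query per feature. The operators follow the Lucene query parser documentation; the field names "title" and "published" are assumptions made up for illustration, since the fields actually available depend on how the plugin maps your classes and attributes.

```shell
#!/bin/sh
# Sketch: sample query strings in Lucene query parser syntax.
# The field names "title" and "published" are hypothetical; the real
# field names depend on the plugin's class/attribute mapping.
lucene_query_examples() {
    cat <<'EOF'
boolean: lucene AND (plugin OR extension) NOT draft
phrase: "enterprise search"
field: title:lucene
wildcard: index* te?t
fuzzy: borgermans~
range: published:[20060101 TO 20061231]
boost: lucene^4 search
EOF
}
```

Calling lucene_query_examples prints the list; the part after each label is what you would type into the search box.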

INSTALL
======
Brief installation instructions are provided in the README file
Important: configure your site with delayed indexing
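
As far as I know, delayed indexing in eZ publish 3.x is switched on with DelayedIndexing=enabled in the [SearchSettings] block of a site.ini override, after which the indexing cronjob has to run regularly. The following is a minimal sketch for checking that setting; it is not taken from the README, and the function name is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: report whether an eZ publish ini file enables delayed indexing.
# Assumption: the setting is DelayedIndexing=enabled under [SearchSettings].
# delayed_indexing_enabled is a hypothetical helper, not part of the plugin.
delayed_indexing_enabled() {
    ini_file=$1
    # Track the current [Section] and succeed only if the setting
    # appears inside [SearchSettings].
    awk '
        /^\[.*\]/ { section = $0; sub(/[ \t\r]+$/, "", section); next }
        section == "[SearchSettings]" && /^DelayedIndexing[ \t]*=[ \t]*enabled[ \t\r]*$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$ini_file"
}
```

For example, delayed_indexing_enabled settings/override/site.ini.append.php returns success only when the override switches delayed indexing on; the path is an assumption about where your override lives.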

UPGRADE
========

- From 0.2 to 0.7
+ due to the changes in meta-data fields, you will have to re-index your eZ publish site with the script provided in extension/lucene/bin/php
+ take a look at the templates, you may need to adapt your own search templates


FUTURE
=======

A lot more; see the summer conference talk, later updates here, and the ROADMAP file in the distribution:

- sorting in advanced search and in template fetch (content, search) [partially implemented]
- integrating image query by example (based on image features texture, color histogram, color distribution) [not released, but first versions work locally]
- search stats (are already collected, but no interface for viewing them)
- more robustness

BUGS
=====

- please report them :-)


AUTHORS & COPYRIGHT
=====================

Paul Borgermans and Kristof Coomans, released under the GPL

Screenshot

screenshot lucene plugin in action

Changelog

- version 0.1 initial alpha release
- version 0.2 second release (beta)
+ lots of bugfixes
+ full support of advanced search
+ lucene specific feature: "More like this" module/view and template operator
- version 0.7
+ more bugfixes and stability/robustness
+ added sorting on meta fields (not on attributes yet)
+ added possibility of specifying which meta fields are searched by default in the normal and advanced search
+ changed api calls to be conformant with Lucene 2.0 (jars are still 1.9.1)
+ removed source and doc files for apache lucene to make the download smaller
+ changed ROADMAP and numbering plan
+ put this release also on pubsvn.ez.no
- version 0.71
+ important bugfix due to wrong commit, resulting in non-working update scripts
+ take into account hidden status of nodes and corresponding ini settings
- version 0.72
+ hidden status check is now in stable branch and tar.gz archive
- version 0.73
+ added charset conversion support
+ bug fix in access query builder (thanks to Harry Oosterveen)
+ removal of PHP warnings
+ bug fix in sorting (now also works with array of depth 1)
+ added patches for 3.8 kernel files

Comments

Problem finding words when accents are omitted

Hello all,
As part of an eZ publish 3.9.2 project I have integrated the Lucene search engine. The search works well, but searching for a word without its accents returns no results. For example, if I search for "probleme" instead of "problème", I get no results.

Thank you in advance

Wrong libs ...

@Massimiliano

I can't remember exactly what was wrong. I think the problem was in the Java libs.
After reinstalling them, no more segfaults.

Best regards,
Sinisa

Locks & Zero results

@Subramanian

It appears that simultaneous writes to the search index may cause the issue you describe. The latest SVN version contains a warning file that states:
IMPORTANT

You should configure ez publish with delayed indexing and make sure multiple updates do not run in parallel. If not, your search index may become corrupt!

Cheers
Bruce
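
One way to keep index updates from running in parallel is to wrap the update script in a simple lock. This is a minimal sketch under stated assumptions, not something shipped with the plugin: run_exclusive and the lock path are hypothetical, and the lock is a directory because mkdir is atomic.

```shell
#!/bin/sh
# Sketch: allow only one index update at a time by using a lock directory.
# mkdir is atomic, so two concurrent callers cannot both create the lock.
# run_exclusive and the lock path are hypothetical, not part of the plugin.
run_exclusive() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another update seems to be running (lock: $lock_dir)" >&2
        return 75   # conventional "temporary failure" code: try again later
    fi
    # Run the wrapped command, then release the lock and pass on its status.
    "$@"
    status=$?
    rmdir "$lock_dir"
    return "$status"
}
```

A cron entry could then call run_exclusive /tmp/lucene-update.lock php extension/lucene/bin/php/updatesearchindexlucene.php. If the wrapped process is killed hard, the lock directory stays behind and has to be removed by hand.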

Segfaults

@Massimiliano + @Sinisa

I was experiencing segfaults on a site running 3.8.4 and found that they were no longer present after an upgrade to 3.8.8.

Cheers
Bruce

@Sinisa: I have updatesearchindexlucene.php problems, too

Hi Sinisa,

I don't get the segfault, but I have the same problem running updatesearchindexlucene.php. Could you please tell me exactly which libraries were messed up, so that I can double-check whether this is my case?

thanks

M.

Re: Does it work with 3.9.2?

Yes, it works fine with 3.9.2 (although I don't use the patch; I've copied the search engine plugin over to kernel/search/plugins).

Does it work in eZ 3.9.2?

Does the Lucene plugin work in eZ 3.9.2?

Thanks

HELP ME PLEASE

Hi there,
We are using the Lucene engine for search. The search often becomes unavailable, reporting that the index is unavailable and returning a zero count. Index creation also fails.
In our initial analysis we found a write.lock file in the index folder, which prevents the index from being created.
Does anyone have any idea when this lock file is created, what it is for, and why it is often not deleted?

Thanks for your valuable time.

Thanks
Subbu
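
For background: Lucene's index writer holds a write lock while it has the index open for changes and releases it when it closes, so a process that dies in the middle of an update can leave the lock file behind. The sketch below only reports such a leftover lock; it does not delete anything. The helper name is hypothetical, the file name write.lock is taken from the comment above, and the -maxdepth and -mmin options are common find extensions rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: report a write.lock that has been sitting in the index directory
# for longer than a given number of minutes.
# stale_write_lock is a hypothetical helper, not part of the plugin.
stale_write_lock() {
    index_dir=$1
    max_age_min=$2
    # find prints the file only if it was last modified more than
    # max_age_min minutes ago.
    stale=$(find "$index_dir" -maxdepth 1 -name write.lock -mmin +"$max_age_min" 2>/dev/null)
    if [ -n "$stale" ]; then
        echo "possibly stale lock: $stale"
        return 0
    fi
    return 1
}
```

Pass it the directory that the update script prints as IndexDir and a threshold of your choice, for instance 30 minutes. Before removing a reported lock, make sure no update process is still running.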

Very good

Just tried lucene - it seems to work very well. Much faster than the standard search engine in eZ.
Two minor issues:
1) When trying the search I got the error
Fatal error: Undefined class name 'ezcontentobjecttreenode' in /opt/ezpublish-3.8.3/extension/lucene/search/plugins/lucene/lucene.php on line 1153
which was easily fixed by including the ezcontentobjecttreenode.php.
2) In search.tpl you need to replace {set $search_extras=...} with a {def $search_extras=...} as the variable is not set in the search.php when using templatesearch. In advancedsearch.tpl this line is missing totally.

One question: On the website for the php-javabridge I read that one can compile Java classes with gcj so that no VM is needed on the server. Have you ever tried that? As I use Windows on my development machine, I can't test this.

Regards

Claudia

Memory leak

Has anyone experienced memory problems when indexing a database with a fair amount of content?

I have to index a database with about 7000 content objects, and the updatesearchindexlucene.php batch fails with a glibc error ("double free or corruption") and a core dump.

Both the PHP and the Java processes grow considerably in memory during the batch.

I experience this problem on Debian sarge and Ubuntu edgy installations.

Disable backend

Hi Thomas,

You can use --disable-backend in your ./configure command line.

To get JavaBridge.jar, you can download php-java-bridge-X.X.X_j2ee.tar.gz. That archive contains JavaBridge.war, which itself contains JavaBridge.jar in the WEB-INF/lib subdirectory. Copy it next to your compiled PHP module and you are set.

Regards
Vince
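
The extraction step described above can be sketched as follows. A .war file is an ordinary zip archive, so unzip can pull the jar out directly. The helper name is hypothetical, and both the location of JavaBridge.war (after unpacking the tarball) and the destination directory are left to the caller, since they depend on your setup.

```shell
#!/bin/sh
# Sketch: pull JavaBridge.jar out of a JavaBridge.war (a .war is a zip file).
# extract_bridge_jar is a hypothetical helper; both paths come from the caller.
extract_bridge_jar() {
    war_file=$1
    dest_dir=$2
    # -j drops the WEB-INF/lib/ prefix, -o overwrites an older copy,
    # -d selects the directory to extract into.
    unzip -j -o "$war_file" 'WEB-INF/lib/JavaBridge.jar' -d "$dest_dir"
}
```

Call it as extract_bridge_jar path/to/JavaBridge.war target-directory, where the target is wherever your compiled PHP module lives.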

Problem Building php-java

I'm running FreeBSD 6.1 and I'm struggling to make php-java work, but so far I couldn't.
I get the error:
configure : error : /usr/local/bin/bash './configure.gnu' failed for server

Before that I have:
./configure.gnu : line 5 aclocal not found
line 6 autoheader not found
line 7 autoconf not found
It suggested disabling the back end (which is nonsense because it won't create the .jar) or using the recommended autoconf, automake and libtool, but I already used the right ones (autoconf259, automake19, libtool1-5).

I would be grateful if someone could help me.

Fixed problem!

Hi Paul and Kristof

I have managed to fix the problem.
Somehow my 64bit and 32bit libraries on Suse were f****d! :-)
It took me a whole week to figure that out.

To Paul: Can you tell me how you configured Apache and PHP on 64-bit Suse?
(./configure --prefix= .... and stuff)

Best regards,
Sinisa

RE: Fatal error in updatesearchindexlucene.php

Sinisa

If you use the stable branch, apparently there is no valid ezp Object. This may point to a corrupt database, otherwise I cannot reproduce this here (also SuSE 64 bit, with PHP and Apache compiled as 64 bit apps, Java VM 32 bit).

--paul

Fatal error in updatesearchindexlucene.php

Hi Kristof and Paul

UPDATE:
-------------------------------------------------------------------------------------------------------------
My SLES9 server is 64-bit, and Apache and PHP are compiled as 64-bit programs!?
Could that be the problem?
-------------------------------------------------------------------------------------------------------------

I have the latest svn version.

php extension/lucene/bin/php/updatesearchindexlucene.php
JavaBridge log: /srv/www/htdocs/intranet/var/java.log
Starting object re-indexing
Number of objects to index: 1639
IndexDir: /srv/www/htdocs/intranet/var/corporate/lucene/main
Initializing writer ...
Looping through objects ...
Indexing Slike with node id 51 and url_alias media/slike

Fatal error: Call to a member function on a non-object in /srv/www/htdocs/intranet/extension/lucene/search/plugins/lucene/lucene.php on line 361

Fatal error: eZ publish did not finish its request
The execution of eZ publish was abruptly ended, the debug output is present below.
Segmentation Fault in 20560, waiting for debugger

My server is SLES9 with Apache 2.2.3 and PHP 4.4.4.
I have the latest php-java-bridge, 3.1.8rc2.

I still have problems with lucene :-(

What to do?

Best regards,
Sinisa

Re: Fatal error

At first I only saw "Fatal error: eZ publish did not finish its request"; then I realized there is an error log. It says:
Fatal error: Call to a member function on a non-object in /home/irc/public_html/ezpublish/extension/lucene/lib/lire.php on line 550

I guess that helps you more...

Regards,
Harry

Re: New fatal error

Hello Harry

Can you give us the exact error you encountered?

Thanks!

Re: New fatal error

Hi,

OK, thanks Kristof, now the search works also fine in these other browsers.
However, the search for similar images still results in a Fatal Error, in any browser, but this is not a feature that we want on our sites anyway.

Regards,
Harry

Re: New fatal error

Hello Harry

You are using the trunk, right? I'll update that one in a few minutes.

Sorry for the troubles.

New fatal error

Hi,

On my site, it still produces a fatal error. Installing the new version produced another one: under images there was a link "Show similar images", and clicking on it also resulted in a fatal error, even after reloading the page.

Kind regards,
Harry
