Showing posts with label SOLR. Show all posts

Thursday, May 18, 2017

Password Protected Solr Admin Page







By default, the Solr Admin page is not password protected, so anyone who can reach it can use it. This article shows how to put the Solr Admin page behind a password.

Solr itself ships without any password protection for the Admin page. The good news is that Solr runs inside a web container such as Jetty, Tomcat, or JBoss, so we can use the container's security features to put the Solr Admin page, as well as Solr indexing API calls, behind a username and password.


Step 1:
Add the following snippet to \solr-6.5.1\server\etc\jetty.xml, inside the <Configure> element:

<Call name="addBean">
  <Arg>
    <New class="org.eclipse.jetty.security.HashLoginService">
      <Set name="name">Secure Realm</Set>
      <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
      <Set name="refreshInterval">0</Set>
    </New>
  </Arg>
</Call>



Step 2:
Add the following snippet to \solr-6.5.1\server\solr-webapp\webapp\WEB-INF\web.xml, inside the <web-app> element:


<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr Search Engine</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>admin</role-name>
  </auth-constraint>
</security-constraint>

<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Secure Realm</realm-name>
</login-config>



Step 3:
Generate the MD5 hash of the password with the command below. This example uses Username: admin and Password: solr123

\solr-6.5.1\bin>java -cp ..\server\lib\jetty-util-9.3.14.v20161028.jar org.eclipse.jetty.util.security.Password admin solr123

Output
2017-05-17 17:24:08.064:INFO::main: Logging initialized @958ms
solr123
OBF:1m0v1l181k8q1y7z1k5g1kxu1lxb
MD5:77cb23aec2e0ff10c2952948346d9817
CRYPT:adVVnmxPfgwZ6
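
The MD5 line of the output can also be produced outside Jetty. Below is a minimal Python sketch, assuming that Jetty's MD5 credential is the plain, unsalted MD5 digest of the password's ISO-8859-1 bytes, printed as lowercase hex. The helper name jetty_md5_credential is made up for illustration and is not part of Jetty or Solr.

```python
import hashlib


def jetty_md5_credential(password: str) -> str:
    """Return an 'MD5:<hex digest>' string for the given password.

    Assumption: Jetty hashes the ISO-8859-1 bytes of the password with
    plain, unsalted MD5 and writes the digest as lowercase hex.
    """
    digest = hashlib.md5(password.encode("iso-8859-1")).hexdigest()
    return "MD5:" + digest


if __name__ == "__main__":
    # Prints a value in the same MD5:<32 hex characters> shape as the tool output.
    print(jetty_md5_credential("solr123"))
```

If the assumption holds, the printed value should match the MD5 line reported by the Password tool for the same password; the Jetty tool remains the reference.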



Step 4:

Create a realm.properties file in the \solr-6.5.1\server\etc\ directory and add the following line to it.

admin: MD5:77cb23aec2e0ff10c2952948346d9817, admin

i.e. <username>: MD5:<MD5 hash of the password>, <role>
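
Once Solr has been restarted with these changes, the protection can be checked from a script. The following is a rough Python sketch, not an official Solr client: it sends an HTTP Basic Authorization header built from the username and password and reports the HTTP status code. The function names are hypothetical, and any URL passed in is an assumption about where your Solr instance listens.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic 'Authorization' header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def fetch_status(url, username=None, password=None, timeout=5):
    """Return the HTTP status code for url, optionally sending credentials."""
    request = urllib.request.Request(url)
    if username is not None:
        request.add_header("Authorization", basic_auth_header(username, password))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401.
        return err.code
```

For example, on a protected instance fetch_status("http://localhost:8983/solr/") should return 401, while fetch_status("http://localhost:8983/solr/", "admin", "solr123") should return a success status.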











Monday, May 1, 2017

Training Document - Introduction to Solr



Introduction to Solr


A brief introduction to Solr for anyone who wants to get trained on Solr.


Table of Contents
1. Introduction to Solr 
2. Solr Terminologies 
3. Installation and Configuration
4. Configuration files schema.xml and solrconfig.xml 
5. Features of SOLR 
    a. Hit Highlighting 
    b. Auto Complete / Suggester 
    c. Stop words 
    d. Synonyms 
    e. SpellCheck 
    f. Geo Spatial Search 
    g. Result Grouping 
    h. Query Syntax 
    i. Query Boosting 
    j. Content Spotlighting / Merchandising / Banner / Elevate 
    k. Block Record / Remove URL Feature 
6. Indexing the Data 
7. Search Queries 
8. DataImportHandler - DIH 
9. Plugins to index various types of Data (XML, CSV, DB, Filesystem) 
10. Solr Client APIs 
11. Overview of SOLRJ API 
12. Running Solr on Tomcat 
13. Enabling SSL on Solr 
14. Zookeeper Configuration 
15. Solr Cloud Deployment 
16. Production Indexing Architecture 
17. Production Serving Architecture 
18. Solr Upgrade
19. References

Saturday, May 21, 2016

[ Solr ] Error Document is missing mandatory uniqueKey field id


In the schema.xml file, the id field is declared with required="true".
The document that we are trying to index in SOLR does not contain this id field, and hence SOLR throws this error.


Solution

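  1. Either add an id to all your documents
OR
  2. Remove required="true" from the id field in the schema file. If id is also declared as the <uniqueKey>, that declaration has to be removed or pointed at another field as well, because the uniqueKey field is always mandatory.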
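
The first option can be applied on the client side before the documents are sent to SOLR. Here is a small Python sketch, assuming the documents are plain dictionaries and that a random UUID is an acceptable id when no natural key exists. The function name ensure_ids is hypothetical.

```python
import uuid


def ensure_ids(docs, id_field="id"):
    """Return copies of docs in which every document has a non-empty id.

    Documents that already carry an id are left unchanged; the others get
    a random UUID (an assumption, use a natural key if you have one).
    """
    result = []
    for doc in docs:
        doc = dict(doc)  # copy, so the caller's documents are not modified
        if not doc.get(id_field):
            doc[id_field] = str(uuid.uuid4())
        result.append(doc)
    return result
```

Note that a random id changes on every run, so re-indexing the same document creates a new entry instead of replacing the old one.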

Hope This Helps!!!!

Sunday, August 10, 2014

[SOLR] How to join fl default parameters and request parameters?



If a set of fields is specified in fl in a SearchRequestHandler configuration in solrconfig.xml, then an fl specified as a query request parameter overrides it.
I want SOLR to join the fl parameters from the query and from the SearchRequestHandler instead of overriding them.
For example:
If the query has fl=field1,field2 and the SearchRequestHandler configuration has fl=field3,field1, the result should be the join of the two: fl=field1,field2,field3.
The following solution addresses this:
Use <lst name="appends"> in your requestHandler definition to make Solr append the configured values to the request instead of replacing them. Since fl can be given several times in the same request, this extends the list of fields to retrieve.


<requestHandler name="/select" class="solr.SearchHandler">
    <lst name="appends">
        <str name="fl">cat</str>
    </lst>
</requestHandler>



This solution helps when there are many fl fields to define and only a few of them change from query to query.
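
To make the expected result concrete, here is a small Python model of the join described above. It is only an illustration of the outcome, not Solr's implementation: it treats every fl value as a comma or space separated list and returns the union of the field names in first-seen order. The function name effective_fl is hypothetical.

```python
def effective_fl(request_fl, appended_fl):
    """Union of the field names from all fl values, in first-seen order.

    request_fl:  fl values sent with the query, e.g. ["field1,field2"]
    appended_fl: fl values from the handler's "appends" section
    """
    seen = []
    for value in list(request_fl) + list(appended_fl):
        # fl accepts both commas and spaces as separators
        for name in value.replace(",", " ").split():
            if name not in seen:
                seen.append(name)
    return seen


if __name__ == "__main__":
    print(effective_fl(["field1,field2"], ["field3,field1"]))
```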

Sunday, August 3, 2014

[SOLR] Implementing Facet with multiple categories along with their count

How can we facet on two or more categories (‘project’ and ‘type’, as discussed below) and still get all the values of each facet along with their counts after a value has been selected?

Example : http://search-lucene.com/

When you open the URL http://search-lucene.com/?q=facets, you can see the facets 'Project', 'type', 'date' and 'author' on the right-hand side, each with its values and their counts in brackets.

For instance, if you select 'solr(32857)' under the 'Project' facet, the other values under 'Project', such as ElasticSearch, are still shown along with their respective counts.







Further, when we select 'mail # user(23138)' under the “type” section, the other values under “type” are again still shown with their counts in brackets, and the counts of the values in the “Project” facet change accordingly.




Observe how solr(32857) changed to Solr (22969) after selecting mail # user, while the other values of ‘Project’ (like ElasticSearch etc.) and ‘type’ (issue, javadoc etc.) remain visible with updated counts.

To achieve similar functionality, you can use the following solution:

Use Tagging and excluding Filters: http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters

select?q=solr%20facets&fq={!tag=projectTag}project:"solr"&fq={!tag=typeTag}type:"mail # user"&facet=true&facet.field={!ex=projectTag}project&facet.field={!ex=typeTag}type&wt=json&indent=true

Using the above solution, I was able to implement the faceting behaviour of the above website with SOLR.
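
The query above contains characters such as {, }, " and # that have to be URL-encoded before it is sent. The following Python sketch builds the same kind of request with tagged filters and excluded tags. The function name, the base URL, and the lowercase field names project and type are assumptions for illustration; adapt them to your schema.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, query, selections):
    """Build a select URL that filters on selections but keeps full facet counts.

    selections maps a facet field to the value the user picked. Each filter
    is tagged as <field>Tag and the matching facet.field excludes that tag.
    """
    params = [("q", query)]
    for field, value in selections.items():
        params.append(("fq", f'{{!tag={field}Tag}}{field}:"{value}"'))
    params.append(("facet", "true"))
    for field in selections:
        params.append(("facet.field", f"{{!ex={field}Tag}}{field}"))
    params += [("wt", "json"), ("indent", "true")]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_facet_query(
        "http://localhost:8983/solr/core0/select",  # assumed core URL
        "solr facets",
        {"project": "solr", "type": "mail # user"},
    ))
```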

Hope this is useful information for you all!!!!



Sunday, July 27, 2014

Crawl your website using Nutch Crawler without Indexing the HTML content into SOLR

This article will help you with the following issue:

1) If you want to crawl a website using the Nutch crawler without indexing the HTML content into SOLR, here are the changes you need to make to the crawl script of the Nutch package.

You need to remove the following pieces of code:

SOLRURL="$3"

if [ "$SOLRURL" = "" ]; then
    echo "Missing SOLRURL : crawl <seedDir> <crawlDir> <solrURL> <numberOfRounds>"
    exit -1;
fi


echo "Indexing $SEGMENT on SOLR index -> $SOLRURL"
$bin/nutch index -D solr.server.url=$SOLRURL $CRAWL_PATH/crawldb -linkdb $CRAWL_PATH/linkdb $CRAWL_PATH/segments/$SEGMENT

if [ $? -ne 0 ]
then exit $?
fi

echo "Cleanup on SOLR index -> $SOLRURL"
$bin/nutch clean -D solr.server.url=$SOLRURL $CRAWL_PATH/crawldb

if [ $? -ne 0 ]
then exit $?
fi

Hope This Helps!!!

Sunday, July 13, 2014

[SOLR] RELOAD solrconfig.xml and schema.xml without restarting the SOLR

The usual answer is that the SOLR instance has to be restarted after a change to schema.xml or solrconfig.xml.

But from SOLR 4.0 onwards, this can be achieved using the RELOAD command.

Command:
http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0

Now, if you make changes to your solrconfig.xml or schema.xml files and want to start using them without stopping and restarting your SOLR instance, just execute the RELOAD command on your core.
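
The RELOAD call can also be issued from a script, for example as part of a deployment step. This is a minimal Python sketch, assuming Solr listens on http://localhost:8983/solr as in the command above and that the admin endpoint is reachable without credentials. The function names are hypothetical.

```python
import urllib.request
from urllib.parse import urlencode


def reload_url(core, base_url="http://localhost:8983/solr"):
    """Build the CoreAdmin RELOAD URL for the given core."""
    return f"{base_url}/admin/cores?" + urlencode({"action": "RELOAD", "core": core})


def reload_core(core, base_url="http://localhost:8983/solr", timeout=30):
    """Send the RELOAD request and return the HTTP status code."""
    with urllib.request.urlopen(reload_url(core, base_url), timeout=timeout) as response:
        return response.status
```

Calling reload_core("core0") sends the same request as the command shown above.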

NOTE:
However, a few configuration changes still need a restart of the SOLR instance:
1) IndexWriter related settings in <indexConfig>
2) Change in <dataDir> location

Hope this Helps!!!

Reference:
https://wiki.apache.org/solr/CoreAdmin#RELOAD



Sunday, June 29, 2014

Zookeeper Cluster Setup

Today we are going to create a zookeeper cluster with 5 instances.

1) Required / Installed software
    1.a) Java - jdk1.7.* (/usr/java)
    1.b) Zookeeper – zookeeper-3.4.6 (/opt/install/zookeeper)
    1.c) Port selection for zookeeper deployment

          1.c.i ) On Different Servers
          Server ID   Server IP    Client Port   Quorum Port   Leader Election Port
          1           10.10.10.1   2181          2888          3888
          2           10.10.10.2   2181          2888          3888
          3           10.10.10.3   2181          2888          3888
          4           10.10.10.4   2181          2888          3888
          5           10.10.10.5   2181          2888          3888


          1.c.ii ) On Same Servers
          Server ID   Server IP    Client Port   Quorum Port   Leader Election Port
          1           10.10.10.1   2181          2888          3888
          2           10.10.10.1   2182          2889          3889
          3           10.10.10.1   2183          2890          3890
          4           10.10.10.1   2184          2891          3891
          5           10.10.10.1   2185          2892          3892
     


2) Installation folder structure
    2.a) Zookeeper Server
                /opt/install/zookeeper/zookeeper1/
                /opt/install/zookeeper/zookeeper2/
                /opt/install/zookeeper/zookeeper3/
                /opt/install/zookeeper/zookeeper4/
                /opt/install/zookeeper/zookeeper5/

    2.b) Data Directory
                /opt/install/zookeeper/data/zookeeper1/
                /opt/install/zookeeper/data/zookeeper2/
                /opt/install/zookeeper/data/zookeeper3/
                /opt/install/zookeeper/data/zookeeper4/
                /opt/install/zookeeper/data/zookeeper5/

    2.c) Logs Directory
                /opt/install/zookeeper/logs/zookeeper1/
                /opt/install/zookeeper/logs/zookeeper2/
                /opt/install/zookeeper/logs/zookeeper3/
                /opt/install/zookeeper/logs/zookeeper4/
                /opt/install/zookeeper/logs/zookeeper5/

3) Download Zookeeper
    3.a) Currently we are using zookeeper-3.4.6.
    3.b) It is available from the website http://zookeeper.apache.org/releases.html
    3.c) Download the zookeeper release (zookeeper-3.4.6 in this setup) and untar it to /tmp/zookeeper.
    3.d) The following steps need to be repeated for every zookeeper instance (zookeeper1 to zookeeper5).
    3.e) Copy /tmp/zookeeper/zookeeper-3.4.6/* to /opt/install/zookeeper/zookeeper1/
    3.f) Copy $ZOO_HOME1/conf/zoo_sample.cfg to $ZOO_HOME1/conf/zoo.cfg, where $ZOO_HOME1 is /opt/install/zookeeper/zookeeper1
    3.g) Edit zoo.cfg so that it contains the following properties

tickTime=2000
initLimit=10
syncLimit=5
# make sure this directory is present
dataDir=/opt/install/zookeeper/data/zookeeper1/
# the port at which the clients will connect
clientPort=2181
# make sure this directory is present
dataLogDir=/opt/install/zookeeper/logs/zookeeper1/
# for the same-server layout, use the IP and ports of table 1.c.ii instead
server.1=10.10.10.1:2888:3888
server.2=10.10.10.2:2888:3888
server.3=10.10.10.3:2888:3888
server.4=10.10.10.4:2888:3888
server.5=10.10.10.5:2888:3888
                               
4) Create a “myid” file in the data directory of each instance.
    4.a) Edit it and write only “1” on server1
    4.b) Edit it and write only “2” on server2
    4.c) Edit it and write only “3” on server3
    4.d) Edit it and write only “4” on server4
    4.e) Edit it and write only “5” on server5
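
Steps 3.g and 4 are repetitive across five instances, so they can be scripted. The following Python sketch writes zoo.cfg and myid for one instance, assuming the different-servers layout from table 1.c.i and the folder structure from section 2. The function names and the base_dir parameter are hypothetical; pass /opt/install/zookeeper as base_dir to match this article.

```python
from pathlib import Path

# Server ID -> IP, taken from the "different servers" table above.
SERVERS = {
    1: "10.10.10.1",
    2: "10.10.10.2",
    3: "10.10.10.3",
    4: "10.10.10.4",
    5: "10.10.10.5",
}


def render_zoo_cfg(server_id, servers, base_dir, client_port=2181):
    """Return the zoo.cfg text for one instance."""
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        f"dataDir={base_dir}/data/zookeeper{server_id}/",
        f"clientPort={client_port}",
        f"dataLogDir={base_dir}/logs/zookeeper{server_id}/",
    ]
    # Quorum port 2888 and leader election port 3888 on every server.
    lines += [f"server.{sid}={ip}:2888:3888" for sid, ip in sorted(servers.items())]
    return "\n".join(lines) + "\n"


def write_instance(server_id, servers, base_dir):
    """Create the conf, data and logs directories, zoo.cfg and myid."""
    base = Path(base_dir)
    conf_dir = base / f"zookeeper{server_id}" / "conf"
    data_dir = base / "data" / f"zookeeper{server_id}"
    log_dir = base / "logs" / f"zookeeper{server_id}"
    for directory in (conf_dir, data_dir, log_dir):
        directory.mkdir(parents=True, exist_ok=True)
    cfg = render_zoo_cfg(server_id, servers, base.as_posix())
    (conf_dir / "zoo.cfg").write_text(cfg)
    # myid contains only the server ID of this instance.
    (data_dir / "myid").write_text(f"{server_id}\n")
```

On server 3, for example, write_instance(3, SERVERS, "/opt/install/zookeeper") would produce the configuration and the myid file for that instance.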

5) Start the zookeeper. 

        (run $ZOO_HOME/bin/zkServer.sh start on each of the zookeeper instances)
        $ /opt/install/zookeeper/zookeeper1/bin/zkServer.sh start
        $ /opt/install/zookeeper/zookeeper2/bin/zkServer.sh start
        $ /opt/install/zookeeper/zookeeper3/bin/zkServer.sh start
        $ /opt/install/zookeeper/zookeeper4/bin/zkServer.sh start
        $ /opt/install/zookeeper/zookeeper5/bin/zkServer.sh start

        Sample Output:
        JMX enabled by default
        Using config: /opt/install/zookeeper/zookeeper1/bin/../conf/zoo.cfg
        Starting zookeeper ... STARTED


6) To STOP the Zookeeper instances
        $ /opt/install/zookeeper/zookeeper1/bin/zkServer.sh stop
        $ /opt/install/zookeeper/zookeeper2/bin/zkServer.sh stop
        $ /opt/install/zookeeper/zookeeper3/bin/zkServer.sh stop
        $ /opt/install/zookeeper/zookeeper4/bin/zkServer.sh stop
        $ /opt/install/zookeeper/zookeeper5/bin/zkServer.sh stop


7) To Check the Zookeeper Status
        $ /opt/install/zookeeper/zookeeper1/bin/zkServer.sh status
        $ /opt/install/zookeeper/zookeeper2/bin/zkServer.sh status
        $ /opt/install/zookeeper/zookeeper3/bin/zkServer.sh status
        $ /opt/install/zookeeper/zookeeper4/bin/zkServer.sh status
        $ /opt/install/zookeeper/zookeeper5/bin/zkServer.sh status

        Sample Output
        JMX enabled by default
        Using config: /opt/install/zookeeper/zookeeper2/bin/../conf/zoo.cfg
        Mode: leader

        JMX enabled by default
        Using config: /opt/install/zookeeper/zookeeper2/bin/../conf/zoo.cfg
        Mode: follower
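
The same Mode information can be read remotely over the client port. Below is a rough Python sketch, assuming that the srvr four-letter command is enabled on the instance and that its reply contains a "Mode:" line like the output above. The function names are hypothetical.

```python
import socket


def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line, or None if it is missing."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_mode(host, port=2181, timeout=5.0):
    """Send 'srvr' to a Zookeeper instance and return its mode."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"srvr")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", "replace"))
```

In a healthy five-instance cluster, calling query_mode for each client port from the tables above should report one leader and four followers.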

