Dataverse:Solr

From BrapciWiki
useradd -m solr
mkdir /usr/local/solr
chown solr:solr /usr/local/solr
su - solr
cd /usr/local/solr
wget https://archive.apache.org/dist/lucene/solr/8.8.1/solr-8.8.1.tgz
tar xvzf solr-8.8.1.tgz
cd solr-8.8.1
cp -r server/solr/configsets/_default server/solr/collection1

You should already have a "dvinstall.zip" file downloaded from https://github.com/IQSS/dataverse/releases. Unzip it into /tmp. Then, from the directory that contains the extracted dvinstall folder, copy the files into place:

cp dvinstall/schema*.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf
cp dvinstall/solrconfig.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf


In /usr/local/solr/solr-8.8.1/server/etc/jetty.xml, raise requestHeaderSize from 8192 to 102400:

<Set name="requestHeaderSize"><Property name="solr.jetty.request.header.size" default="102400" /></Set>

Collections

cd /home/dataverse/
cp dvinstall/schema*.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf
cp dvinstall/solrconfig.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf

Create the collection1 collection in Solr

echo "name=collection1" > /usr/local/solr/solr-8.8.1/server/solr/collection1/core.properties
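
Before starting Solr it can help to confirm that the core directory holds the files copied above. The following is a minimal sketch, not part of the Dataverse installer; the function name check_core_dir is hypothetical, and it assumes the core needs core.properties, conf/schema.xml and conf/solrconfig.xml.

```shell
# Check that a core directory has the files Solr needs to discover and load it.
# Usage: check_core_dir /path/to/server/solr/collection1
# (check_core_dir is a hypothetical helper, not a Solr or Dataverse command.)
check_core_dir() {
    core_dir=$1
    missing=0
    for f in core.properties conf/schema.xml conf/solrconfig.xml; do
        if [ ! -f "$core_dir/$f" ]; then
            echo "missing: $core_dir/$f"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

Call it with the collection1 directory used on this page; it prints nothing when all three files are in place.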

File solr.service

pico /etc/systemd/system/solr.service
[Unit]
Description = Apache Solr
After = syslog.target network.target remote-fs.target nss-lookup.target
[Service]
User = solr
Type = forking
WorkingDirectory = /usr/local/solr/solr-8.8.1
ExecStart = /usr/local/solr/solr-8.8.1/bin/solr start -m 1g -j "jetty.host=127.0.0.1"
ExecStop = /usr/local/solr/solr-8.8.1/bin/solr stop
LimitNOFILE=65000
LimitNPROC=65000
Restart=on-failure
[Install]
WantedBy = multi-user.target
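
Because the unit file hard-codes the Solr install path, a wrong version number in ExecStart only shows up when the service fails to start. A small sketch of a pre-check, assuming the unit file layout shown above; check_unit_exec is a hypothetical helper name.

```shell
# Print the executable named in ExecStart= of a unit file and check it exists.
# Usage: check_unit_exec /etc/systemd/system/solr.service
# (check_unit_exec is a hypothetical helper, not a systemd command.)
check_unit_exec() {
    # Take the first word after "ExecStart =" as the path to the binary.
    exec_path=$(sed -n 's/^ExecStart[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1)
    if [ -x "$exec_path" ]; then
        echo "ok: $exec_path"
    else
        echo "not found or not executable: $exec_path"
        return 1
    fi
}
```

Run it against the unit file before systemctl daemon-reload; a "not found" result usually means the version in the path does not match the extracted directory.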

You should not run Solr as root. Create a user named solr and a directory in which to install Solr.

useradd solr -m
mkdir /usr/local/solr
chown solr:solr /usr/local/solr
su - solr
cd /usr/local/solr
wget https://archive.apache.org/dist/lucene/solr/8.8.1/solr-8.8.1.tgz
tar xvzf solr-8.8.1.tgz
cd solr-8.8.1
cp -r server/solr/configsets/_default server/solr/collection1

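Use the "dvinstall.zip" file downloaded in the prerequisites step. Extract it into /tmp if you have not already done so, then copy the files into the directories below. The commands below read from /home/dataverse/dvinstall; adjust the source path if you extracted the archive elsewhere.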

cp /home/dataverse/dvinstall/schema*.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf
cp /home/dataverse/dvinstall/solrconfig.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf

Dataverse requires a change to the jetty.xml file that ships with Solr. Edit it and increase requestHeaderSize from 8192 to 102400.

nano /usr/local/solr/solr-8.8.1/server/etc/jetty.xml 

<Set name="requestHeaderSize"><Property name="solr.jetty.request.header.size" default="102400" /></Set>
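
The same edit can be scripted instead of done by hand in an editor. This is a sketch only: set_request_header_size is a hypothetical helper, and it assumes the stock jetty.xml carries the property on one line with a numeric default attribute, as in the line above.

```shell
# Raise the Jetty request header size in a jetty.xml file.
# Usage: set_request_header_size /path/to/jetty.xml [size]
# (set_request_header_size is a hypothetical helper name.)
set_request_header_size() {
    jetty_xml=$1
    new_size=${2:-102400}
    # Keep a backup next to the original before editing.
    cp "$jetty_xml" "$jetty_xml.bak" || return 1
    # Only change default= on the line naming solr.jetty.request.header.size.
    sed "/solr\.jetty\.request\.header\.size/ s/default=\"[0-9]*\"/default=\"$new_size\"/" \
        "$jetty_xml.bak" > "$jetty_xml"
}
```

Other properties that happen to have default="8192" are left untouched, and the original file remains available as jetty.xml.bak.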

Solr warns that the maximum number of file descriptors and processes should be raised in a production environment, although it still runs with the defaults. Dataverse already raises these limits to the recommended levels by adding the line ulimit -n 65000 to the init script; for good measure, also add the following to the limits file:

nano /etc/security/limits.conf
solr soft nproc 65000
solr hard nproc 65000
solr soft nofile 65000
solr hard nofile 65000
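
To confirm the limits took effect, compare the values reported by ulimit with the 65000 configured above. A minimal sketch, assuming it is run in a fresh login shell of the solr user (for example after su - solr); check_limit is a hypothetical helper. Note that limits.conf is applied to login sessions, while a service started by systemd takes its limits from LimitNOFILE and LimitNPROC in the unit file.

```shell
# Report whether a limit meets the required minimum.
# Usage: check_limit NAME ACTUAL REQUIRED
# (check_limit is a hypothetical helper name.)
check_limit() {
    name=$1 actual=$2 required=$3
    if [ "$actual" = "unlimited" ] || [ "$actual" -ge "$required" ]; then
        echo "$name OK ($actual)"
    else
        echo "$name too low ($actual < $required)"
        return 1
    fi
}

# Example calls for a bash login shell of the solr user:
#   check_limit nofile "$(ulimit -n)" 65000
#   check_limit nproc  "$(ulimit -u)" 65000
```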

Running Solr as a service

cp /home/dataverse/dataverse-5.3/doc/sphinx-guides/source/_static/installation/files/etc/systemd/solr.service /etc/systemd/system/.
systemctl daemon-reload
systemctl start solr.service
systemctl enable solr.service

This starts the service now and enables it at boot.

Indexing

Full Reindex

There are two ways to perform a full reindex of the Dataverse installation search index. Starting with a “clear” ensures a completely clean index but involves downtime. Reindexing in place doesn’t involve downtime but does not ensure a completely clean index.

Clear and Reindex

Index and Database Consistency

Get a list of all database objects that are missing in Solr, and Solr documents that are missing in the database:

curl http://localhost:8080/api/admin/index/status

Remove all Solr documents that are orphaned (i.e., not associated with objects in the database):

curl http://localhost:8080/api/admin/index/clear-orphans

Clearing Data from Solr

Please note that the moment you issue this command, it will appear to end users looking at the root Dataverse installation page that all data is gone! This is because the root Dataverse installation page is powered by the search index.

curl http://localhost:8080/api/admin/index/clear

Start Async Reindex

Please note that this operation may take hours depending on the amount of data in your system. This known issue is being tracked at https://github.com/IQSS/dataverse/issues/50

curl http://localhost:8080/api/admin/index

Reindex in Place

An alternative to completely clearing the search index is to reindex in place.

Clear Index Timestamps

curl -X DELETE http://localhost:8080/api/admin/index/timestamps

Start or Continue Async Reindex

If indexing stops, this command should pick up where it left off based on which index timestamps have been set, which is why we start by clearing these timestamps above. These timestamps are stored in the dvobject database table.

curl http://localhost:8080/api/admin/index/continue

Reindex

curl http://localhost:8080/api/admin/index/status
curl http://localhost:8080/api/admin/index/clear-orphans
curl http://localhost:8080/api/admin/index/clear
curl http://localhost:8080/api/admin/index
curl -X DELETE http://localhost:8080/api/admin/index/timestamps
curl http://localhost:8080/api/admin/index/continue
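
The list above mixes the commands of both approaches; in practice only one of the two is run. The sketch below groups them into two functions. The names clear_and_reindex, reindex_in_place and run_curl, and the DRY_RUN and DATAVERSE_URL variables, are assumptions for illustration; only the endpoints come from this page.

```shell
# Base URL of the Dataverse admin API (this page uses localhost:8080).
DATAVERSE_URL=${DATAVERSE_URL:-http://localhost:8080}

# Run a curl command, or only print it when DRY_RUN=1.
run_curl() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "curl $*"
    else
        curl "$@"
    fi
}

# Option 1: clear the index, then rebuild it (involves downtime).
clear_and_reindex() {
    run_curl "$DATAVERSE_URL/api/admin/index/clear" &&
    run_curl "$DATAVERSE_URL/api/admin/index"
}

# Option 2: reindex in place (no downtime, index not guaranteed clean).
reindex_in_place() {
    run_curl -X DELETE "$DATAVERSE_URL/api/admin/index/timestamps" &&
    run_curl "$DATAVERSE_URL/api/admin/index/continue"
}
```

Setting DRY_RUN=1 before calling either function prints the curl commands without contacting the server, which is a cheap way to review what will run.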

Solr is not indexing

Copy the files again

cd /usr/local/solr/solr-8.8.1/server/solr/collection1/conf
cp /home/dataverse/dvinstall/solrconfig.xml solrconfig.xml
cp /home/dataverse/dvinstall/schema.xml schema.xml
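
One way to confirm that the copied files are the Dataverse versions rather than the Solr defaults is to search them for the word "dataverse". A minimal sketch; check_dataverse_config is a hypothetical helper name.

```shell
# Succeed if FILE mentions "dataverse" (case-insensitive); report either way.
# (check_dataverse_config is a hypothetical helper name.)
check_dataverse_config() {
    if grep -qi 'dataverse' "$1"; then
        echo "$1: looks like a Dataverse config"
    else
        echo "$1: no mention of dataverse; copy the file again"
        return 1
    fi
}
```

Call it once for solrconfig.xml and once for schema.xml in the collection1 conf directory.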

Check that the word "dataverse" appears inside them.

Restart the Solr service

service solr restart

Reindex the content

curl http://localhost:8080/api/admin/index/clear
curl http://localhost:8080/api/admin/index

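Security

File: /usr/local/solr/solr-8.8.1/bin/solr.in.sh

Add the following line, which disables Log4j message lookups (a mitigation recommended for CVE-2021-44228):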

SOLR_OPTS="$SOLR_OPTS -Dlog4j2.formatMsgNoLookups=true"
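
If the setting is applied from a script, it is worth making the edit idempotent so that repeated runs do not append the line twice. A sketch under that assumption; ensure_line is a hypothetical helper name.

```shell
# Append LINE to FILE only if that exact line is not already present.
# Usage: ensure_line FILE LINE
# (ensure_line is a hypothetical helper name.)
ensure_line() {
    target_file=$1
    wanted_line=$2
    # -x matches whole lines, -F treats the text literally (no regex).
    grep -qxF -- "$wanted_line" "$target_file" 2>/dev/null ||
        printf '%s\n' "$wanted_line" >> "$target_file"
}

# Example (single quotes keep $SOLR_OPTS literal in the file):
#   ensure_line /usr/local/solr/solr-8.8.1/bin/solr.in.sh \
#       'SOLR_OPTS="$SOLR_OPTS -Dlog4j2.formatMsgNoLookups=true"'
```

Restart Solr afterwards so the new option is picked up.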