Last updated: 2024-11-14

Full text indexing installation


The Full Text Indexing component enables search on phrases inside documents.

The Full Text Indexing component can be installed on the same server as ShareAspace Host and other ShareAspace components, but the recommendation is to install any External Extension component on a separate machine.

Installing and configuring the Full Text Indexing component for a ShareAspace Collection is optional. However, full text search in a space that is configured for it only works if the Full Text Indexing component is installed and configured for that collection.

Prerequisites


The Full Text Indexing component server needs to meet the following requirements:

Note

It is possible to set up the Full Text Indexing component on one machine and Elasticsearch on another, but for security reasons it is recommended to install both on the same machine.

Setup Elasticsearch

Note

The installation process is well documented on the Elasticsearch website. The setup process below is only an example of how to use the default SSL certificate and username/password.

Automatic TLS certificate

When Elasticsearch is started in console mode for the first time, SSL/TLS certificates are automatically created and placed in the \config\certs folder under the Elasticsearch installation folder.

PS C:\elasticsearch-8.12.1\bin> .\elasticsearch.bat

Note

If there is already a TLS certificate matching the local machine name available, Elasticsearch will not generate a new one.

Note

Any certificate can be used instead of the one created by the Elasticsearch setup scripts. Usernames and passwords can also be changed. The Elasticsearch website has documentation covering this.

The certificate should be added as a trusted certificate on the Elasticsearch server, e.g. by running the Import-Certificate cmdlet in PowerShell:

Import-Certificate -FilePath "C:\elasticsearch-8.12.1\config\certs\http_ca.crt" -CertStoreLocation Cert:\LocalMachine\Root

Note

The TLS certificate used by Elasticsearch must be trusted by the machine where the ShareAspace FullTextIndexing extension is hosted.

Manual self-signed TLS certificate

Important

Elasticsearch comes with tooling for managing certificates; refer to the Elasticsearch certificate utility documentation for more information. The following is an example of how to set up a self-signed TLS certificate for Elasticsearch. The example assumes that Elasticsearch is hosted on the same server as the ShareAspace FullTextIndexing extension and is set up to be hosted at https://localhost:9200/.

ShareAspace full text indexing appsettings.json

{
  ...
  "ConnectionSettings": {
    "ServiceEndpoint": "https://localhost:9200/"
  },
  ...
}

To generate a self-signed Elasticsearch TLS certificate for localhost, use the elasticsearch-certutil.bat script found under the bin folder of the Elasticsearch installation.

PS C:\elasticsearch-8.12.1\bin> .\elasticsearch-certutil.bat cert --self-signed --pem --multiple

The script asks a series of questions. Press Enter for all of them except:

  • "Enter instance name:" - type: localhost
  • "Enter DNS names for instance (comma-separated if more than one) []:" - type: localhost

A ZIP file (default certificate-bundle.zip) containing the certificate (localhost.crt) and the key (localhost.key) will be created.

Unpack the certificate and key under \config\certs.

Add the certificate as a trusted certificate on the machine:

Import-Certificate -FilePath "C:\elasticsearch-8.12.1\config\certs\localhost.crt" -CertStoreLocation Cert:\LocalMachine\Root

Open the Elasticsearch configuration file \config\elasticsearch.yml and add the following lines:

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.certificate: "certs\\localhost.crt"
xpack.security.http.ssl.key: "certs\\localhost.key"

Superuser password

A default password for the elastic superuser is generated automatically. It can be changed later by running the following command:

PS C:\elasticsearch-8.12.1\bin> .\elasticsearch-reset-password.bat -u elastic

Ingest

The Ingest plugin is included as a module in Elasticsearch since version 8.12.1 and no longer has to be installed as a plugin.

If Elasticsearch cannot read the external GeoIP database (this is visible in the log), the feature can be turned off.

Open the Elasticsearch configuration file \config\elasticsearch.yml and add the following line:

ingest.geoip.downloader.enabled: false

Install as a service

Elasticsearch can be installed and hosted as a Windows service.

To install the service, run:

PS C:\elasticsearch-8.12.1\bin> .\elasticsearch-service.bat install

Once installed, the Windows service can be started using:

PS C:\elasticsearch-8.12.1\bin> .\elasticsearch-service.bat start
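
Once the service is running, an authenticated request against the Elasticsearch root endpoint is one way to confirm that the TLS certificate is trusted and that the elastic credentials work. The sketch below is not part of the ShareAspace or Elasticsearch tooling: the function names New-BasicAuthHeader and Test-ElasticsearchEndpoint are hypothetical, and the endpoint assumes the https://localhost:9200/ address used on this page.

```powershell
# Hypothetical helper: builds an HTTP Basic Authorization header.
function New-BasicAuthHeader {
    param(
        [Parameter(Mandatory)] [string]$UserName,
        [Parameter(Mandatory)] [string]$Password
    )
    $pair  = "{0}:{1}" -f $UserName, $Password
    $bytes = [Text.Encoding]::UTF8.GetBytes($pair)
    return @{ Authorization = "Basic " + [Convert]::ToBase64String($bytes) }
}

# Hypothetical helper: requests the Elasticsearch root endpoint.
# The call fails with a TLS error if the certificate is not trusted by this machine.
function Test-ElasticsearchEndpoint {
    param(
        [string]$Uri = "https://localhost:9200/",
        [Parameter(Mandatory)] [hashtable]$Headers
    )
    return Invoke-RestMethod -Method Get -Uri $Uri -Headers $Headers
}

# Example (requires a running Elasticsearch node):
# $headers = New-BasicAuthHeader -UserName "elastic" -Password "<PASSWORD>"
# Test-ElasticsearchEndpoint -Headers $headers
```

Running the same check from the machine that hosts the Full Text Indexing extension also shows whether that machine trusts the certificate.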

ShareAspace Full text indexing extension


Run the installer FullTextIndexing-elastic8-x.y.z.build.msi to install the Full Text Indexing component. The default installation location is C:\Program Files\Eurostep\ShareAspace\FullTextIndexing. During installation, a new Application Pool named FullTextIndexing is created in IIS; it is used by the new FullTextIndexing Application.

Important

The SSL/TLS certificate used by the Elasticsearch server must be trusted by the server hosting the full text indexing extension.

Configuration

The file appsettings.json in the installation folder of the Full Text Indexing component is the settings file for the component. The Full Text Indexing component requires a symmetric key that is used to ensure secure communication. The symmetric key is configured with the setting SymmetricKey in the NovaConfig section.

The SymmetricKey value should be generated as described in the section Generate Symmetric Signing Keys.

"NovaConfig": {
  "SymmetricKey": "OUET/sLT1l7/oEVnBGX6DvTstFydSLL0mVrStzvnn8SmK1/rahzBgjZAVmJKxrG08dlP6BNrqA9CRVCN3O4tXQ=="
}
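
The section Generate Symmetric Signing Keys remains the authoritative procedure. Purely as an illustration of the format, the example key above decodes to 64 bytes, so a key of the same shape could be sketched as below. The function name New-SymmetricKey and the choice of 64 random bytes are assumptions drawn from that example, not a documented requirement.

```powershell
# Hypothetical helper: returns a Base64 string of cryptographically random bytes.
# The 64-byte default is an assumption based on the length of the example key.
function New-SymmetricKey {
    param([int]$ByteLength = 64)
    $bytes = [byte[]]::new($ByteLength)
    $rng = [System.Security.Cryptography.RandomNumberGenerator]::Create()
    try {
        $rng.GetBytes($bytes)
    }
    finally {
        $rng.Dispose()
    }
    return [Convert]::ToBase64String($bytes)
}

# Usage: paste the printed value into the SymmetricKey setting.
New-SymmetricKey
```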

The ServiceEndpoint setting in the ConnectionSettings section is the address of Elasticsearch, which tells the Full Text Indexing component where to send its requests. It should match the address the Elasticsearch node is bound to (the node setting network.host). The default value is https://localhost:9200/.

"ConnectionSettings": {
  "ServiceEndpoint": "https://localhost:9200"
}

The hosts that can connect to the Full Text Indexing component can be restricted using the setting AllowedHosts. The default value is *, i.e. any host. Subdomain wildcards are permitted, e.g. "*.example.com" matches subdomains such as foo.example.com, but not the parent domain example.com.
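
The wildcard rule can be illustrated with a small sketch. This is not the component's own matching code; the function name Test-AllowedHost is hypothetical, and it only mirrors the behaviour described above for a single pattern.

```powershell
# Hypothetical illustration of the AllowedHosts matching rule described above.
function Test-AllowedHost {
    param(
        [Parameter(Mandatory)] [string]$Pattern,
        [Parameter(Mandatory)] [string]$HostName
    )
    # "*" allows any host.
    if ($Pattern -eq '*') { return $true }

    # "*.example.com" allows subdomains only, never the parent domain itself.
    if ($Pattern.StartsWith('*.')) {
        $suffix = $Pattern.Substring(1)   # ".example.com"
        return ($HostName.Length -gt $suffix.Length) -and
               $HostName.EndsWith($suffix, [StringComparison]::OrdinalIgnoreCase)
    }

    # Anything else must match the host name exactly (case-insensitive).
    return $HostName.Equals($Pattern, [StringComparison]::OrdinalIgnoreCase)
}

Test-AllowedHost -Pattern '*.example.com' -HostName 'foo.example.com'
Test-AllowedHost -Pattern '*.example.com' -HostName 'example.com'
```

The first call allows the subdomain and the second rejects the parent domain.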

A Nova Extension must be configured for the ShareAspace collection before that collection can use the Full Text Indexing component.

Note

The Full Text Indexing component must be up and running during the registration process since a manifest exchange will take place between ShareAspace and the extension.

The IndexedCharacters configuration parameter is used for setting the maximum number of characters that should be indexed per indexed document. The default value is 100 000 characters.

{
  ...
  "IndexedCharacters": 100000
}

If the value is set to -1 the indexer will index all characters.

Warning

Indexing all characters could severely impact the performance and memory requirements of the indexing server.
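
The effect of the IndexedCharacters limit can be pictured with the sketch below. It is an illustration only: the function name Get-IndexedText is hypothetical, and keeping the first characters of the document is an assumption about how the limit is applied, not a description of the indexer's implementation.

```powershell
# Hypothetical illustration: which part of a document's text falls within the limit.
function Get-IndexedText {
    param(
        [Parameter(Mandatory)] [AllowEmptyString()] [string]$Text,
        [int]$IndexedCharacters = 100000
    )
    # -1 means no limit: all characters are indexed.
    if ($IndexedCharacters -eq -1) { return $Text }

    # Shorter documents are indexed in full.
    if ($Text.Length -le $IndexedCharacters) { return $Text }

    # Assumption: only the first IndexedCharacters characters are kept.
    return $Text.Substring(0, $IndexedCharacters)
}

$document = 'a' * 250000
(Get-IndexedText -Text $document).Length
(Get-IndexedText -Text $document -IndexedCharacters -1).Length
```

With the default limit only the first 100000 of the 250000 characters are kept, while -1 keeps them all.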

The Secret setting in the Elastic section holds the secret used when authenticating against Elasticsearch.

The username and password are configured by passing the --user and --password command line arguments to Eurostep.SAS.FullTextIndexingHost.exe:

Eurostep.SAS.FullTextIndexingHost.exe --user <USERNAME> --password <PASSWORD>
Eurostep.SAS.FullTextIndexingHost.exe --user elastic --password OLx..YskTyo=

Eurostep.SAS.FullTextIndexingHost.exe returns a "secret" key that should be added to appsettings.json:

"Elastic": {
    "Secret": "_FULLTEXTINDEXINGSECRET_"
}

Add full text indexing extension with script

It is possible to set up the External Extension needed for the Full Text Indexing component with a PowerShell script.

Download add-fulltext-indexing-extension.ps1

# Builds the Authorization header: an HMAC-SHA512 of the request path,
# keyed with the Base64-decoded symmetric key and encoded as unpadded Base64url.
function getAuthorizationHeader ($key, $path) {
    $encodedPath = [Text.Encoding]::ASCII.GetBytes($path)
    $sha = New-Object System.Security.Cryptography.HMACSHA512
    $sha.Key = [Convert]::FromBase64String($key)
    $hash = $sha.ComputeHash($encodedPath)
    $hashString = [Convert]::ToBase64String($hash)
    # Strip the padding and switch to the URL-safe alphabet.
    $bearerToken = $hashString.Split('=')[0]
    $bearerToken = $bearerToken.Replace('+', '-')
    $bearerToken = $bearerToken.Replace('/', '_')
    return @{"Authorization" = "Bearer $bearerToken"}
}

# Registers the external extension with ShareAspace through the admin API.
function addExternalExtension($adminKey, $externalExtensionKey, $externalExtensionUri) {
    $path = "/admin/novaExtension"
    $uri = "https://localhost:5001" + $path
    $headers = getAuthorizationHeader $adminKey $path
    $body = @{
        tokenLifeTime = 15
        active = $true
        apiKey = $externalExtensionKey
        hostUri = $externalExtensionUri
    } | ConvertTo-Json
    Invoke-RestMethod -Method Post -Uri $uri -ContentType "application/json" -Headers $headers -Body $body
}

# Replace the placeholders below with the real keys and the extension URI.
$adminKey = "admin-symmetric-key"
$fullTextIndexingKey = "full-text-indexing-symmetric-key"
$fullTextIndexingUri = "https://my.fulltextindexing.machine.com/FullTextIndexing"
addExternalExtension $adminKey $fullTextIndexingKey $fullTextIndexingUri
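
When the registration request is rejected, it can help to inspect the bearer token offline before suspecting the network. The sketch below repeats the token derivation from the script in a standalone form; the function name Get-BearerToken is hypothetical, and the all-zero key is a dummy value for illustration only and must never be used as a real key.

```powershell
# Hypothetical standalone version of the token derivation used by the script:
# HMAC-SHA512 of the path, keyed with the Base64-decoded key, as unpadded Base64url.
function Get-BearerToken {
    param(
        [Parameter(Mandatory)] [string]$Key,   # Base64-encoded symmetric key
        [Parameter(Mandatory)] [string]$Path
    )
    $hmac = New-Object System.Security.Cryptography.HMACSHA512
    try {
        $hmac.Key = [Convert]::FromBase64String($Key)
        $hash = $hmac.ComputeHash([Text.Encoding]::ASCII.GetBytes($Path))
    }
    finally {
        $hmac.Dispose()
    }
    return [Convert]::ToBase64String($hash).TrimEnd('=').Replace('+', '-').Replace('/', '_')
}

# Dummy key for illustration only: 64 zero bytes, Base64-encoded.
$dummyKey = [Convert]::ToBase64String([byte[]]::new(64))
$token = Get-BearerToken -Key $dummyKey -Path "/admin/novaExtension"
"Token length: $($token.Length)"
```

A token produced this way contains only letters, digits, "-" and "_". A key that is not valid Base64 makes FromBase64String throw, which is a common cause of failures when placeholder values are left in the script.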

IIS settings

When the ShareAspace Full Text Indexing component is installed, an Application Pool named "FullTextIndexing" is created in IIS. By default, the account running the Application Pool is the built-in account "LocalSystem".

Note

If another account is used to run the "FullTextIndexing" Application Pool, that account must have read access to the Full Text Indexing installation folder.
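
One way to grant that read access from PowerShell is sketched below. The function name Grant-FolderReadAccess and the account name in the usage line are hypothetical; the sketch relies on the standard Get-Acl and Set-Acl cmdlets on Windows and must be run from an elevated session.

```powershell
# Hypothetical helper: grants an account inherited read access to a folder.
function Grant-FolderReadAccess {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [Parameter(Mandatory)] [string]$Account
    )
    $acl = Get-Acl -Path $Path
    # Read access, inherited by subfolders and files.
    $rule = New-Object System.Security.AccessControl.FileSystemAccessRule(
        $Account, 'Read', 'ContainerInherit,ObjectInherit', 'None', 'Allow')
    $acl.AddAccessRule($rule)
    Set-Acl -Path $Path -AclObject $acl
}

# Example (hypothetical account name, default installation folder):
# Grant-FolderReadAccess -Path "C:\Program Files\Eurostep\ShareAspace\FullTextIndexing" -Account "MYDOMAIN\svc-fulltext"
```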