Overview
Deploying Greet requires installing the Office extractor on a VM hosted in the customer's infrastructure. This IaaS architecture lets the customer apply their own security layer and benefit from enhanced protection, since no inbound flow to the VM is needed.
Greet Architecture

The required IaaS architecture consists of a Debian VM (latest version) of type B2MS or equivalent (at least 2 vCPU, 8 GB RAM, 30 GB SSD).
The installation is divided into 3 steps:
- Creating the VM hosting the extractor
- Creating Azure Vault and registering the application on Azure
- Deploying and Configuring the Office Extractor
To perform the installation, you must have the rights to:
- Create a virtual machine on your cloud environment and unblock the necessary outgoing flows
- Create an Azure Vault and register a new application to your Azure environment
- Grant permissions (to access the Microsoft Graph API) on the Office Extractor application
Deploying Office extractor
This part describes the steps to deploy the Office Extractor on your Azure infrastructure.
Setting up an Azure Vault
Azure Key Vault is Microsoft's vault solution for storing encryption keys, passwords or any sensitive information. You can find more information on the official website.
Extractors require an encryption key to encrypt sensitive data. Storing the encryption key in Azure Key Vault greatly reduces the risk of theft. We recommend its use although it is not mandatory.
Before you start registering the key to the Azure Key Vault service, register a new application that will be used to access it. The procedure is described here
- On the Azure Vault portal, click on "Create"

- Define the basic and billing information and click on "Next"

- Create a new Vault access policy

- In the "Configure from a template" select "Secret management" then check the boxes corresponding to the authorizations

- Select the previously registered application and click on "Next"

- Verify the information of the Vault then click on "Create"

- On the Vault page created earlier, create a new "Secret"

- Fill in the information of the secret and click on "Create"

Please note the following information needed later for extractor setup:
- TenantId, clientId and the value of the secret of the application that has access to the Vault (different from that of the extractor, for separation of responsibilities)
- Vault Name
- Name of the secret created
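To check that the registered application can actually read the secret, the values noted above can be exercised against the Key Vault REST API. The following is a minimal sketch and not part of the extractor: the variable names (TENANT_ID, CLIENT_ID, CLIENT_SECRET, VAULT_NAME, SECRET_NAME), the function names and the api-version value are assumptions for illustration, and it presumes that curl is installed.

```shell
#!/usr/bin/env bash
# Sketch only: all variable and function names are illustrative placeholders.

# Build the Key Vault REST URL of a secret.
# The api-version (7.4) is an assumption; adjust it to your environment.
vault_secret_url() {
  local vault_name="$1" secret_name="$2"
  printf 'https://%s.vault.azure.net/secrets/%s?api-version=7.4\n' \
    "$vault_name" "$secret_name"
}

# Request an OAuth2 token for Key Vault (client-credentials flow).
# Prints the raw JSON response, which contains an "access_token" field.
vault_token_response() {
  curl -sS -X POST \
    "https://login.microsoftonline.com/${TENANT_ID}/oauth2/v2.0/token" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=${CLIENT_ID}" \
    --data-urlencode "client_secret=${CLIENT_SECRET}" \
    --data-urlencode "scope=https://vault.azure.net/.default"
}

# Read the secret, given a bearer token taken from the response above.
vault_read_secret() {
  local token="$1"
  curl -sS -H "Authorization: Bearer ${token}" \
    "$(vault_secret_url "$VAULT_NAME" "$SECRET_NAME")"
}
```

A typical run would export the five variables, call vault_token_response, copy the access_token value from the JSON output and pass it to vault_read_secret. An HTTP 403 response usually suggests that the access policy for the application is missing or incomplete.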
Registering the application on Azure
From the home page, search for “registration” and click on “application registrations”, as below:

Then click on “new registration”
Enter a name and then click on the “register” button
Once registered, note the application ID and the directory ID. They will be used later for the Office 365 extractor, together with the client secret that we will generate in the next step:
To generate the client secret, click on “certificates & secret”, then “New client secret”, enter a description and click “Add”:
Don't forget to copy the generated secret to the clipboard afterwards:
Graph permissions
Now we need to add permissions for our application. To do this, click on “API permissions”, then “Add a permission”, and finally select Microsoft Graph:
Then choose “Application permissions”:
And select the following permissions:
- Calendars.Read (Admin only)
- Directory.Read.All (Admin only)
- Files.Read.All (Admin only)
- Group.Read.All (Admin only)
- Mail.Read (Admin only)
- MailboxSettings.Read (Admin only)
- Reports.Read.All (Admin only)
- Sites.Read.All (Admin only)
- User.Read.All (Admin only)
- Tasks.Read.All (Admin only)
- ChannelMessage.Read.All (Admin only)
- Chat.Read.All (Admin only)
You will then have the option to grant administrator consent for these permissions:
Please note the TenantId, clientId and secret value, which are needed later for the extractor configuration.
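Once administrator consent is granted, one way to confirm that the permissions are effective is to request an app-only token for Microsoft Graph and inspect its roles claim, which lists the application permissions carried by the token. The sketch below assumes that curl and base64 are available; TENANT_ID, CLIENT_ID, CLIENT_SECRET and the function names are placeholders chosen for illustration, not part of the extractor.

```shell
#!/usr/bin/env bash
# Sketch only: variable and function names are illustrative placeholders.

# Request an app-only token for Microsoft Graph (client-credentials flow).
# Prints the raw JSON response, which contains an "access_token" field.
graph_token_response() {
  curl -sS -X POST \
    "https://login.microsoftonline.com/${TENANT_ID}/oauth2/v2.0/token" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=${CLIENT_ID}" \
    --data-urlencode "client_secret=${CLIENT_SECRET}" \
    --data-urlencode "scope=https://graph.microsoft.com/.default"
}

# Decode the payload (second dot-separated part) of a JWT access token.
jwt_payload() {
  local payload
  # Convert the base64url alphabet back to standard base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # JWTs drop base64 padding; restore it so that base64 -d accepts the input.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}
```

Passing the access_token value to jwt_payload prints a JSON document. Its roles array should contain the permissions selected above, such as User.Read.All.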
Installing the o365 extractor
Infrastructure Prerequisites
Once your VM is created, configure your firewall to allow the following incoming/outgoing flows:
Source | Destination | Port | Protocol | Domain | Note |
Extractor | IP Mongo Greet | 27017 | TCP/IP (TLS 1.3) | N/A | IP address communicated during deployment |
Extractor | Microsoft Graph | 443 | HTTPS | *.microsoftonline.com and *.microsoft.com | Data collection |
Extractor | SFTP Lecko | 22 | SFTP | | Automatic updates and log uploads |
Extractor | Azure Vault | 443 | HTTPS | *.vault.azure.net | Retrieve secret from Azure Vault |
Browser | Extractor | 8080 | HTTP | dns-server:8080 | Temporary incoming flow for extractor configuration |
Please send us the fixed public IP address of the VM so that we can authorize it on our MongoDB. We will then provide you with the following information:
- IP address of your database instance at Greet, as well as the credentials and certificates for TLS
- SFTP access for log upload and automatic updates
- URL of the Debian package to install on the extractor VM
To set up the extractor with the wizard, you will need access to its web interface at http://dns-server:8080/o365/setup
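The outgoing flows listed in the prerequisites can be checked from the extractor VM before starting the installation. The sketch below is one possible way to do it with bash's built-in /dev/tcp redirection; the function names are illustrative, and the host names passed to it (for example login.microsoftonline.com, or the MongoDB and SFTP addresses provided to you) are values to adapt to your environment.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative.

# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
check_flow() {
  local host="$1" port="$2"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Print one OK/FAILED line per "host port" pair given as arguments.
report_flows() {
  local host port
  while [ "$#" -ge 2 ]; do
    host="$1"; port="$2"; shift 2
    if check_flow "$host" "$port"; then
      echo "OK ${host}:${port}"
    else
      echo "FAILED ${host}:${port}"
    fi
  done
}
```

For example, report_flows login.microsoftonline.com 443 graph.microsoft.com 443 prints one line per destination. A FAILED line can mean a blocked flow, but also a DNS or routing problem, so it is only a first indication.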
Generating JKS Keys for MongoDB
You have received the TLS credentials and keys (client.pem and ca.crt) of your MongoDB from Greet. The next step is to convert them to JKS format so that the extractor can use them. Here are the necessary properties:
- javax.net.ssl.trustStore: the path to a trust store containing the signing authority certificate
- javax.net.ssl.trustStorePassword: the password to access this trust store
- javax.net.ssl.keyStore: the path to a keystore containing the client's SSL certificates
- javax.net.ssl.keyStorePassword: the password to access this keystore
In practice, client.pem is converted into a keystore protected by a KeyStorePassword, and ca.crt is loaded directly into the default truststore of the JRE in %JAVA_HOME%/lib/security/cacerts, so that the truststore also includes the default certificates that Java programs need to work properly.
Change the default cacerts password
The default password for cacerts is changeit. To change it:
- Connect to the extractor server over ssh
- Retrieve your %JAVA_HOME% through the following command:
readlink -f /usr/bin/java | sed "s:bin/java::"
- Note your %JAVA_HOME% and generate a new password new_pass, then replace them in the following command:
sudo keytool -storepasswd -v -new new_pass -keystore %JAVA_HOME%/lib/security/cacerts
Convert client.pem and ca.crt keys to JKS
Now we will generate keystore.ks from client.pem and import ca.crt into the cacerts of the JDK. To do this, make sure that both client.pem and ca.crt are in the same folder, then run the commands below:
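# Bundle the client certificate and private key from client.pem into a PKCS12 file
openssl pkcs12 -export -out client.pkcs12 -in client.pem
# Create truststore.ks with a temporary key pair, then delete that pair to leave an empty store
keytool -validity 2500 -genkey -keyalg RSA -alias mongoca -keystore truststore.ks
keytool -delete -alias mongoca -keystore truststore.ks
# Import the CA certificate into the trust store
keytool -import -v -trustcacerts -alias mongoca-ca -file ca.crt -keystore truststore.ks
# Create an empty keystore.ks the same way, then import the PKCS12 bundle into it
keytool -genkey -keyalg RSA -alias mongoca -keystore keystore.ks
keytool -delete -alias mongoca -keystore keystore.ks
keytool -v -importkeystore -srckeystore client.pkcs12 -srcstoretype PKCS12 -destkeystore keystore.ks -deststoretype JKS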
At the end of these commands you will have the keys and passwords below necessary later for the configuration of the extractor:
- the keystore.ks file configured with the KeyStorePassword
- the cacerts file as truststore configured with the new_pass as trustStorePassword
If you get a duplicate key error, you can remove the old one with a command like this:
sudo keytool -delete -alias mongo-ca -keystore %JAVA_HOME%/lib/security/cacerts
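The PKCS12 export used in the conversion needs both a certificate and a private key in client.pem. As a hedged pre-check, the sketch below looks for the two PEM headers before you run the conversion; the function name is an illustrative choice, and the check only inspects the headers, not the validity of the key material.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is illustrative.

# Return 0 if the PEM file holds both a certificate and a private key,
# which "openssl pkcs12 -export -in client.pem" needs to build the bundle.
pem_has_cert_and_key() {
  local pem="$1"
  grep -q -e '-----BEGIN CERTIFICATE-----' "$pem" &&
    grep -Eq -e '-----BEGIN (RSA |EC |ENCRYPTED )?PRIVATE KEY-----' "$pem"
}
```

For example, pem_has_cert_and_key client.pem || echo "client.pem is incomplete" flags a file that lacks one of the two parts. After the conversion, keytool -list -keystore keystore.ks prompts for the keystore password and lists the imported entries.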
Extractor configuration wizard
1. Connect to the virtual machine using SSH
2. Download and install the Lecko extractor (request the URL of the Debian package from Lecko):
wget {URL_Pack_O365}
sudo apt install ./lecko-o365-extractor-jdk17.deb
3. Now go to the web page http://dns-server:8080/o365/setup from the browser for which you have configured access
Step 1: Access to Mongo Lecko

- Host : MongoDB database host
- Username: MongoDB database username
- Password : MongoDB database password
- Port: Default database port 27017
- Database: Default o365_extractor database (same as other extractors)
- TLS: Check this box, then fill in the fields below:

- keyStorePathFile: The absolute path to a keystore containing the client's SSL certificate
- keyStorePassword: The password to access this keystore, by default changeit
- trustStorePathFile: The path to a trust store containing the signing authority certificate
- trustStorePassword: The password to access this trust store, by default changeit
- sslInvalidHostNameAllowed: Check this box, because the application uses an IP address and not DNS.
Step 2: Azure application


- Tenant : directory ID (tenant)
- ClientID : Application ID (client)
- Client Secret : Client Secrets (Certificates & Secrets)

User, Email, Calendar, Drive, SharePoint and Teams: this is the list of modules to be extracted. Unless otherwise indicated, you can keep everything selected.

These properties are about the extraction of the Teams part.
- Limit extraction to Teams created after date : (Optional) teams created before this date will be ignored
- Teams active on the last X months filter : (Optional) teams not active in the last X months will be ignored
- Anonymize teams name : if activated, names of the teams will be encrypted
- Absolute path to the file containing teams : (Optional) absolute path to file containing a list of team ids. One line per id. If set, only teams in the list will be monitored and extracted.
Step 3: Extraction scope

- Company Name : Name of your company
- Extraction start date : to ignore old documents during extraction, enter a date in the format YYYY-MM-DD
- Domain filter : Domain or list of domains of your company
- User filter : enter the full path to a file on the extractor VM containing the user list. If active, only data of users from the list will be extracted. The file has one entry per line, no comma. A trailing empty line is optional (User filter and Azure group filter cannot both be selected)
- Azure group filter : enter the full path to a file on the extractor VM containing the list of Azure group IDs. If active, only data of users in the listed groups will be extracted. Additionally, cohorts will be dynamically created from these groups. The file has one Azure group ID per line, no comma. A trailing empty line is optional (User filter and Azure group filter cannot both be selected)
- Anonymous sharepoint names : if selected, names and URLs of SharePoint sites will be encrypted
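Because the filter files must contain one entry per line without commas, a quick format check before entering their paths in the wizard can avoid a silent misconfiguration. The following sketch is an assumption about what such a check could look like, not a tool shipped with the extractor; the function name is hypothetical, and it treats blank lines in the middle of the list as errors while accepting trailing ones.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is illustrative.

# Check a filter file: one entry per line, no comma, blank lines only at the end.
# Prints one message per problem and returns 1 if any problem was found.
validate_list_file() {
  local file="$1"
  awk '
    /,/ { printf "line %d: contains a comma\n", NR; bad = 1 }
    /^[ \t\r]*$/ { blank = NR; next }
    blank { printf "line %d: blank line inside the list\n", blank; bad = 1; blank = 0 }
    END { exit (bad ? 1 : 0) }
  ' "$file"
}
```

The same check can be applied to the user list, the Azure group ID list and the file containing team IDs, since all three follow the one-entry-per-line convention.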
Step 4: Azure Vault

This step requires the Vault information noted in “Setting up an Azure Vault”.
Using the Vault is recommended but not mandatory. You can skip this step by checking the "Skip Vault configuration" box.
- Vault name : Name of the Vault
- Tenant : Tenant ID
- Client ID : Application ID that has access to the Vault (client)
- Client Secret : Client secret (Certificates & secrets)
- Secret name : Name of the secret created in the Vault
Step 5: SFTP configuration

The use of SFTP is recommended, but not mandatory. You can ignore this step by checking the "Ignore SFTP configuration" box.
The extractor updates itself autonomously, but configuring automatic updates over SFTP adds an extra level of security, as it ensures the authenticity of the downloaded update packages.
- Server : SFTP server address
- User : SFTP account user name
- Key path : SFTP account key path saved on extractor server
- Allow logs to be sent to sftp server : Sends extraction logs to the sftp server for rapid access during maintenance operations.
Step 6: Hashing salt

- Salt for hashing : Secret used for encrypting emails and usernames. You must use the same salt as the one configured for the other extractors.
4. Restart the puller service via the following command:
sudo systemctl restart lecko-extractor
5. Go to the dashboard and inspect the extraction with the menu
Congratulations! You have successfully configured the Lecko Office 365 extractor
To check the main logs, type the command below:
sudo tail -f /opt/lecko-extractor/log/extraction.log
Setting up HTTPS (optional)
To access the installation wizard over HTTPS, start by having certificates issued by your company's CA or by creating self-signed certificates. Then configure HTTPS directly in the /opt/lecko-extractor/application.yml file.
The extractor framework only accepts private keys in PKCS12 format. If you already have PEM certificates, you can convert them with the following command:
openssl pkcs12 -export -in fullchain.pem -inkey privkey.pem -out keyst -name tomcat -CAfile chain.pem -caname root
Alternatively, it is possible to generate a PKCS12 certificate directly with the following command:
cd /opt/lecko-extractor
keytool -genkeypair -alias selfsigned_localhost_sslserver -keyalg RSA -keysize 2048 -storetype PKCS12 -keystore keystore-ssl-key.p12 -validity 3650
You will be prompted to enter information about yourself, your company, and a password. The key generated is called keystore-ssl-key.p12
Now you have to create the configuration file /opt/lecko-extractor/application.yml after stopping the service with the following commands:
sudo systemctl stop lecko-extractor.service
sudo nano /opt/lecko-extractor/application.yml
Then add the SSL configuration as below:
isForceEnterSetup: true
security:
  require-ssl: true
server:
  port: 8443
  ssl:
    key-store: keystore-ssl-key.p12
    key-store-type: PKCS12
    key-store-password: {PASSWORD}
Once the configuration is saved, the interface is accessible only over HTTPS at https://dns-server:8443/o365/setup
Deploying Yammer extractor
Token generation
Declare the application in Yammer
- Connect to the Yammer platform and access the following URL: here as a certified administrator
A new page will appear here with all your saved apps.
- Click on the “Register new Apps / Register a new application” button
- On the pop-up that appears on the screen, enter the requested information.
- Application Name: LeckoAnalytics + [Network Name]
- Organization: Lecko
- Support email: [email protected]
- Website: http://lecko.fr
- Redirect URI: https://www.yammer.com/
Recover the token
The previous step leads to a page like this:
- Click on the link to generate the developer token
- Copy the generated key; it is used in the next step
- This token must be sent to Lecko at the address [email protected] with a copy of [email protected]
Revoke the token
You can revoke a token at any time so that it can no longer be used. To do so:
Go to this link: https://www.yammer.com/{network_name}/account/applications
On that page, revoke access (a confirmation message appears)
Installing the Yammer extractor
First connect to the virtual machine using SSH. On Windows:
- Download an SSH client (PuTTY is recommended):
- Open https://putty.org/
- Download PuTTY from the PuTTY download section:
Then download putty.exe (64-bit or 32-bit, depending on your OS; if you are not sure, start with 64-bit and, if it does not work, try the 32-bit version):
Run the downloaded putty.exe file and fill in the following fields:
- Hostname: example: [email protected] ( server's [email protected] )
- Connection Type: SSH
Click Open, and press “Yes” if the confirmation window regarding the SSH fingerprint appears.
The following window should appear. Type your password and press Enter (note that nothing is displayed while you type the password):
After a successful login, you should see the following:
Now run the following commands one after the other (each command should be followed by pressing the Enter key). You can copy and paste them into the PuTTY terminal window. To do this, please copy the text below and right-click on the terminal:
Request the new package url from Lecko, then run the following commands:
wget {URL_Pack_Yammer}
sudo apt install ./lecko-yammer-extractor-jdk17.deb
The Yammer extractor has a configuration interface that allows you to generate an encrypted configuration, so this mode is highly recommended.
The wizard is accessible through a browser. If you configure your extractor over the internet, you should secure the traffic between your browser and the Yammer extractor. To do so, install an Apache or Nginx server (your choice) together with Certbot in order to enable HTTPS.

Mongo DB
- host : MongoDB database host
- username: MongoDB database username
- password : MongoDB database password
- port: Default database port 27017
- database: Default o365_extractor database (same as o365 extractor)
- withTLS: Set to true to enable TLS. MongoDB configuration and JKS key generation are required, and you will then have to fill in the fields below:
- keyStorePathFile: The absolute path to a keystore containing the client's SSL certificates
- keyStorePassword: The password to access this keystore, by default changeit
- trustStorePathFile: The path to a trust store containing the signing authority certificate
- trustStorePassword: The password to access this trust store, by default changeit
- sslInvalidHostNameAllowed: Sets whether invalid hostnames can be allowed, needed if going through tunnels or redirects
Yammer connection
- Company name: Put the name of the company
- Yammer access token : Token generated from the Yammer API
Auto-updates with SFTP
The extractor features a secure automatic update system via Azure SFTP. Its use is optional but highly recommended.
- host: sftp server address
- user: sftp account name
- keyPath: Path to sftp account login key
- clientName: Name of the client running the instance
- allowPushLog: Allow log files to be sent to the sftp server.
Security
Salt for hashing : Secret used for encrypting emails and usernames. You must use the same salt for the Office 365 and Yammer extractors.
You can configure the extractor manually, without using the Wizard, by opening the configuration file:
sudo nano /opt/lecko-yammer-extractor/application.ym
Copy the template below then fill in the values:
spring:
  application:
    name: Yammer Extractor
  date:
    mongodb:
      host:
      username:
      password:
      port:
      database:
      authentication-database:
      withTLS: false
      keyStorePathFile:
      keyStorePassword:
      trustStorePathFile:
      trustStorePassword:
      sslInvalidHostNameAllowed: true
extractor:
  nbDays: -2000
  accessToken:
  extractUsers: true
  extractGroups: true
  extractEvents: true
  newerOlder: older_than
  numberOfThreads: 10
  clientName:
  saltForHashing:
security:
  sftp:
    host:
    user:
    keyPath:
    clientName:
    allowPushLog:
- host : MongoDB database host
- username: MongoDB database username
- password : MongoDB database password
- port: Default database port 27017
- database: Default o365_extractor database (same as o365 extractor)
- withTLS: Set to true to enable TLS. MongoDB configuration and JKS key generation are required, and you will then have to fill in the fields below:
- keyStorePathFile: The absolute path to a keystore containing the client's SSL certificates
- keyStorePassword: The password to access this keystore, by default changeit
- trustStorePathFile: The path to a trust store containing the signing authority certificate
- trustStorePassword: The password to access this trust store, by default changeit
- sslInvalidHostNameAllowed: Sets whether invalid hostnames can be allowed, needed if going through tunnels or redirects
- Salt for hashing : Secret used for encrypting emails and usernames. You must use the same salt for the Office 365 and Yammer extractors.
- nbDays : -2000 (Reset to -5 once the first extraction is done)
- accessToken : Token generated from the Yammer API
- clientName : Put the name of the company
- saltForHashing : Character string used to encrypt user emails. Use the same value as the one configured on the Office 365 extractor and Teams
The extractor features a secure automatic update system via Azure SFTP. Its use is optional but highly recommended.
- host: sftp server address
- user: sftp account name
- keyPath: Path to sftp account login key
- clientName: Name of the client running the instance
- allowPushLog: Allow log files to be sent to the sftp server.
Then restart the extraction service:
sudo systemctl restart lecko-yammer-extractor.service
To check that the extractor is started, run this command line:
sudo systemctl status lecko-yammer-extractor.service
To follow the progress of the extraction in the extractor's log file, run this command:
sudo tail -f /opt/lecko-yammer-extractor/log/extraction.log
Congratulations! You have successfully configured the Yammer extractor
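When the extractor was configured manually, the nbDays value is meant to be reset to -5 once the first extraction is done. The sketch below shows one way to make that change without opening an editor. It assumes a plain-text (not wizard-encrypted) configuration file; the function name and the CONFIG_FILE variable mentioned afterwards are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is illustrative.

# Replace the value of the nbDays key in a YAML configuration file,
# keeping the key's original indentation.
set_nb_days() {
  local file="$1" value="$2"
  sed -i -E "s/^([[:space:]]*nbDays:).*/\1 ${value}/" "$file"
}
```

For example, with CONFIG_FILE set to the path of the configuration file, sudo bash -c would be needed if the file is owned by root; after set_nb_days "$CONFIG_FILE" -5, restart the service with sudo systemctl restart lecko-yammer-extractor.service so that the new value is taken into account.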
Services maintenance
The Office 365 extractor
Two services are installed by default on the office 365 extractor package:
- lecko-extractor.service: runs the Office 365 extraction
- lecko-extractor-updater.service: automatically applies updates to the Office 365 extractor
Checking the status of services
Both services are started automatically when the machine is started, therefore no action is necessary when restarting the server.
In addition, to know the status of the services, run the commands below:
sudo systemctl status lecko-extractor
The service should be active and the command should return a similar result:

Likewise for the second service:
sudo systemctl status lecko-extractor-updater
The service should be loaded and active for less than 10 minutes, with status 0/SUCCESS, as shown below:

Starting and stopping services
To stop, start or restart both services, just run the commands below respectively:
sudo systemctl stop lecko-extractor
sudo systemctl start lecko-extractor
sudo systemctl restart lecko-extractor
sudo systemctl stop lecko-extractor-updater
sudo systemctl start lecko-extractor-updater
sudo systemctl restart lecko-extractor-updater
Checking service logs
The various log files produced by the extraction services are available in the folder /opt/lecko-extractor/log
To view the list of existing logs, type the following command:
ls /opt/lecko-extractor/log
The services produce 4 log files:
- error.log: contains the error logs of the office 365 extraction
- notification.log: contains Microsoft push notification logs for call records
- extraction.log: contains all office 365 extraction logs
- update.log: contains service logs for lecko-extractor-updater
To track extraction logs, type the command below:
tail -f /opt/lecko-extractor/log/extraction.log
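Besides following the log interactively, it can be useful to check whether the extraction log is still being written to, for instance from a monitoring script. The sketch below relies only on the file's modification time, so it makes no assumption about the log format; the function name and the default threshold of 30 minutes are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and default threshold are illustrative.

# Return 0 if the file exists and was modified within the last N minutes
# (default: 30), using GNU find's -mmin test.
log_is_fresh() {
  local file="$1" minutes="${2:-30}"
  [ -n "$(find "$file" -mmin "-${minutes}" 2>/dev/null)" ]
}
```

For example, log_is_fresh /opt/lecko-extractor/log/extraction.log 60 || echo "no log activity for an hour" reports a log that has gone quiet. The same function works for the Yammer extractor log by passing its path instead.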
Azure app secret renewal
Whether for the extractor's Azure application or the one that has access to the Vault, Azure application secrets have an expiration date (6 months by default). Plan their renewal at least one week before they expire. When a new secret is created, it must be encrypted and then replaced directly in the application.yml file.
To encrypt a value with the extractor, put it in the secretToEncrypt property of application.yml:
sudo nano /opt/lecko-extractor/application.yml
An encryption job runs every 15 minutes and encrypts this value to generate:
- secretEncrypted value to put in security.oauth2.client.clientSecret when it comes to extractor Azure App renewal
- secretDefaultEncrypted value to put in security.vault.clientSecret when it comes to Azure App vault renewal
Finally, you will need to restart the service for the changes to take effect:
sudo systemctl restart lecko-extractor
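Since secrets should be renewed at least one week before they expire, a small reminder script can help. The sketch below computes the number of days until an expiry date that you record yourself when creating the secret. It assumes GNU date (as found on Debian); the function names are illustrative and the expiry date is an input you supply, not something the sketch reads from Azure.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative.

# Print the number of whole days between a reference date (default: today, UTC)
# and an expiry date, both given as YYYY-MM-DD.
days_until() {
  local expiry="$1" today="${2:-$(date -u +%F)}"
  echo $(( ( $(date -ud "$expiry" +%s) - $(date -ud "$today" +%s) ) / 86400 ))
}

# Return 0 if the secret expires within 7 days (or has already expired).
secret_needs_renewal() {
  local expiry="$1" today="${2:-$(date -u +%F)}"
  [ "$(days_until "$expiry" "$today")" -le 7 ]
}
```

For example, secret_needs_renewal 2025-01-10 && echo "renew the Azure app secret" could be run daily from cron, with the date replaced by the expiry date shown in the Azure portal for the secret.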
The Yammer extractor
Two services are installed by default on the yammer extractor package:
- lecko-yammer-extractor.service: runs the Yammer extraction
- lecko-yammer-extractor-updater.service: automatically applies updates to the Yammer extractor
Checking the status of services
Both services are started automatically when the machine is started, therefore no action is necessary when restarting the server.
In addition, to know the status of the services, run the commands below:
sudo systemctl status lecko-yammer-extractor
The service should be active and the command should return a similar result:

Likewise for the second service:
sudo systemctl status lecko-yammer-extractor-updater
The service should be loaded and active for less than 10 minutes, with status 0/SUCCESS, as shown below:

Starting and stopping services
To stop, start or restart both services, just run the commands below respectively:
sudo systemctl stop lecko-yammer-extractor
sudo systemctl start lecko-yammer-extractor
sudo systemctl restart lecko-yammer-extractor
sudo systemctl stop lecko-yammer-extractor-updater
sudo systemctl start lecko-yammer-extractor-updater
sudo systemctl restart lecko-yammer-extractor-updater
Checking service logs
The various log files produced by the extraction services are available in the folder /opt/lecko-yammer-extractor/log
To view the list of existing logs, type the following command:
ls /opt/lecko-yammer-extractor/log
The services produce 3 log files:
- error.log: contains yammer extraction error logs
- extraction.log: contains all yammer extraction logs
- update.log: contains service logs for lecko-yammer-extractor-updater
To track extraction logs, type the command below:
tail -f /opt/lecko-yammer-extractor/log/extraction.log
Export HAR file
The HAR file provides log data for analysing interactions between the browser and the server. This helps us to diagnose the problem.
- Open a private browser window
- Right click, choose ‘inspect’ from the menu
- Go to the Greet url
- Go to the ‘network’ tab in the window that opens on the right
- Identify yourself by clicking on the ‘Connect with Microsoft 365’ button
- Download the .har file by clicking on the ‘HAR export’ button
- Send the file to us
In addition, below is a video showing how these steps work.