vagrant.geotrellis

A vagrant environment for doing GeoTrellis development.

Vagrant and Ansible must be installed.

Overview

This repository can be used to set up a virtual machine environment to develop on GeoTrellis using Vagrant. The virtual machine will include GeoTrellis, Spark, HDFS, ZooKeeper, and Accumulo.

Installation Requirements

In order to get started with this virtual machine, some software must be installed on the host machine and the host machine must support virtualization.

Software

Vagrant

Vagrant (version >= 1.7.2) is required on the host to manage the virtual machine. Binaries are available for most operating systems.

Ansible

Ansible (version >= 1.8.2) is required to handle configuration of the virtual machine. Ansible is officially supported for Mac OS X and Linux host environments, though it can be used with a Windows host machine. There are multiple ways to install Ansible; choose the one most appropriate for your operating system.

VirtualBox

VirtualBox is an open source virtualization software package used to run virtual machines. Binaries are available for most operating systems.

Note: It is also possible to run the virtual machine using a Kernel-based Virtual Machine (KVM) on Linux. This can be done using the Vagrant Libvirt provider. Vagrant Libvirt is still under active development, and there are additional requirements if KVM is used.

Git

Git is used for version control. It is necessary to use Git to download the GeoTrellis code and submit patches for development.

System Requirements

Your host machine should have at least 6GB of memory and a modern x86-64 processor with virtualization support enabled.
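
The minimum versions listed above can be checked from a terminal before going further. The following is a minimal sketch, not part of this repository: the helper names `version_ge` and `check_tool` are made up for illustration, and the comparison assumes a `sort` that supports the `-V` (version sort) flag, which may not hold on older systems.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if version $1 >= version $2.
# Assumes `sort -V` is available (GNU coreutils and recent BSD/macOS sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print a one-line status for a tool.
#   $1 = tool name, $2 = minimum version, $3 = detected version (may be empty)
check_tool() {
    name=$1; min=$2; found=$3
    if [ -z "$found" ]; then
        echo "$name: not found"
    elif version_ge "$found" "$min"; then
        echo "$name $found: ok (>= $min)"
    else
        echo "$name $found: too old (need >= $min)"
    fi
}

# Take the first dotted number from each tool's version banner.
# If a tool is missing, the variable is simply left empty.
vagrant_version=$(vagrant --version 2>/dev/null | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1)
ansible_version=$(ansible --version 2>/dev/null | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1)

check_tool vagrant 1.7.2 "$vagrant_version"
check_tool ansible 1.8.2 "$ansible_version"
```

This only checks the Vagrant and Ansible versions. It does not verify memory or whether virtualization is enabled in the host's firmware settings.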

Getting Started

  1. Clone this repository.

    git clone https://github.com/geotrellis/vagrant.geotrellis.git

    Note: If you wish to submit patches to this repository, you should consider forking this repository.

  2. Install the required software listed above.

  3. Fork GeoTrellis

  4. Navigate into the directory created by cloning vagrant.geotrellis

  5. Clone GeoTrellis from your forked repository.

    At this point, the directories in the vagrant.geotrellis directory should look like this.

    vagrant.geotrellis
    ├── ansible
    ├── geotrellis
    ├── README.md
    └── Vagrantfile
    
  6. Determine the appropriate folder syncing option by setting the VAGRANT_GEOTRELLIS_SYNC environment variable

    • For Linux and Mac OS X, NFS is likely the best option
    • For Windows, consider using rsync

    Rsync requires an extra process to be run to sync folders when developing, but has huge performance benefits compared to other options. This will greatly speed up compiling and running GeoTrellis since build products will not need to be synced back and forth between your guest and host machine.

    Value      Sync Folder Type
    nfs        NFS
    rsync      RSYNC
    (unset)    OS Default
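
As an illustration of the recommendation above, the variable could be set from the host's kernel name. This is only a sketch under stated assumptions: the function name `pick_sync_type` is hypothetical, and the mapping simply mirrors the suggestions in this README (NFS for Linux and Mac OS X, rsync for Windows shells such as Git Bash, MSYS, or Cygwin). Any other system leaves the variable unset so the OS default applies.

```shell
#!/bin/sh
# Hypothetical helper: map a kernel name (as printed by `uname -s`)
# to one of the VAGRANT_GEOTRELLIS_SYNC values from the table above.
pick_sync_type() {
    case "$1" in
        Linux|Darwin)         echo "nfs" ;;    # Linux and Mac OS X hosts
        MINGW*|MSYS*|CYGWIN*) echo "rsync" ;;  # common Windows shell environments
        *)                    echo "" ;;       # unknown host: leave unset
    esac
}

sync_type=$(pick_sync_type "$(uname -s)")
if [ -n "$sync_type" ]; then
    export VAGRANT_GEOTRELLIS_SYNC="$sync_type"
    echo "VAGRANT_GEOTRELLIS_SYNC=$VAGRANT_GEOTRELLIS_SYNC"
else
    echo "VAGRANT_GEOTRELLIS_SYNC left unset (OS default)"
fi
```

For the export to be visible to `vagrant up`, the snippet has to be sourced into the current shell (or pasted into it) rather than run as a separate script. Setting the variable by hand, for example `export VAGRANT_GEOTRELLIS_SYNC=nfs`, works just as well.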
  7. In the top level directory containing the Vagrantfile, bring up the virtual machine from the command line.

    vagrant up

    At this point Vagrant will start the virtual machine and begin provisioning it with Ansible. Depending on internet connection speeds, installation and downloading of all dependencies could take some time.

  8. Once the machine finishes provisioning, you can verify that Accumulo and HDFS are running by navigating to their web UIs.

  9. If using the rsync shared folder option, start the vagrant rsync process to ensure your changes in the GeoTrellis code get synced to the virtual machine.

    vagrant rsync-auto

  10. Once finished, you can ssh into the machine, navigate to the GeoTrellis directory, and start hacking on GeoTrellis.

    vagrant ssh
    cd /home/vagrant/geotrellis/

  11. In order to run a program using GeoTrellis on Spark, you will need to create an assembly (fat jar) of the project like so:

    cd geotrellis
    sbt "project spark" assembly

You can use spark-submit on the Vagrant machine to start a Spark job:

    spark-submit \
      --class geotrellis.spark.ingest.AccumuloIngestCommand \
      --master 'local[4]' \
      --driver-memory 1G \
      --driver-library-path /usr/local/lib \
      /vagrant/geotrellis/spark/target/scala-2.10/geotrellis-spark-assembly-0.10.0-SNAPSHOT.jar \
      --input s3a://$AWS_ACCESS_KEY:$AWS_SECRET_KEY@geotrellis-test/nlcd-geotiff \
      --instance geotrellis-accumulo-cluster --user root --password secret --zookeeper localhost \
      --table tiles --layerName NLCD

Note: You may need to specify fs.s3a.access.key and fs.s3a.secret.key in hdfs-site.xml if your secret key includes a / character.
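
Whether the note applies to you can be checked before submitting the job. The sketch below is an illustration only: `is_url_safe_secret` is a hypothetical helper, and it assumes the secret key is held in the `AWS_SECRET_KEY` environment variable, as in the spark-submit example. It tests only for the `/` character mentioned in the note and never prints the key itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the value contains no "/",
# fail (exit 1) if it does.
is_url_safe_secret() {
    case "$1" in
        */*) return 1 ;;
        *)   return 0 ;;
    esac
}

# AWS_SECRET_KEY is assumed to be set as in the spark-submit example;
# an unset variable is treated as empty.
if is_url_safe_secret "${AWS_SECRET_KEY:-}"; then
    echo "secret key has no '/': the s3a://KEY:SECRET@bucket form should be usable"
else
    echo "secret key contains '/': set fs.s3a.access.key and fs.s3a.secret.key in hdfs-site.xml instead"
fi
```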
