Survey Slate | Technical Overview

Documentation for getting started with your own survey application, plus an orientation to the application components.

Source code

Core Application Notebooks

Three notebooks comprise the main entry points for the different types of system users:

Notebook | Audience | Purpose
Filler | End Users / Those Answering Questions | Provides a view of the survey to be completed. Access is granted to survey fillers by way of authorized links.
Designer | Survey Creators / Those Asking Questions | Provides an interface to create and edit survey content.
Admin | Technical Owners / IT Administrators | Provides utilities for survey & user management, including the definition of permissions/access controls and a means of deploying created surveys to CloudFront.

Each application notebook can be accessed from the Observablehq domain. Filler and Designer are also deployed into an S3 bucket for self-hosted usage.

Operational Notebooks

Notebook | Purpose
Manual | User guide to using the system
Technical documentation | Infrastructure information, this document!
Credentials | Borrower credential management
Analysis Template | Analysis of responses

Application Customizations & Extension Notebooks

To extend and adapt application functionality and customize application services programmatically, we utilize additional notebooks:

Notebook | Purpose
Configuration | Survey Slate installation configuration
Components | Survey component library
GESI Styling | Page layout and CSS for GESI surveys
GESI Components | Branded components for all applications
Survey Slate Styling | Page layout and CSS for unbranded surveys
Survey Slate Components | Unbranded components for all applications

Ecosystem Helper Notebooks

Survey Slate is built using other resources contributed and shared openly by our team members and others. These open source notebooks are found within the Observable ecosystem. We'd like to highlight some of the bigger contributions:

Notebook | Purpose
AWS Helpers | Helpers for accessing resources hosted on AWS, based on the AWS SDK for JavaScript.
Deploy Notebook to S3 | An architecture based on the AWS Helpers that makes it easier to deploy notebooks to S3 (and to invalidate the CloudFront cache).
Notebook Secret | A means of encrypting notebook secrets.
View literal | A helper function (and corresponding explanatory notes) intended to ease the burden of building complex UIs.
UI development | A tutorial on building complex UIs.
Jeremy Ashkenas' Inputs | A set of UI components first published in 2018 and initially the 'go-to' means of adding interactive inputs to Observable notebooks.
Observable's Inputs | Another set of UI components, released in 2021 and intended to help ensure greater inputs accessibility (for both humans and machines).

Cloud Hosting Architecture

Survey Slate uses AWS for application hosting, data storage and user management. Service components are as follows:

AWS Service | Purpose
S3 | Cloud file storage. Separate buckets are used for surveys, responses and configuration.
IAM | User management and access policies.
CloudFront | Content delivery (for application serving).
Certificate Manager | TLS certificate management (note: US-EAST-1 Region).

The design concept aims for Cloud simplicity:

Static configuration for AWS resource layout is in the configuration notebook.

S3 structure

For security and ease of comprehension, data are divided across three S3 buckets. Each bucket is prefixed by its security theme.

Bucket name prefix | Purpose
confidential | Contains end user data (survey responses).
private | Contains private org data (survey content and settings).
public | Contains externally accessible data (application software and encrypted end user credentials).

Within each bucket are several purpose-specific directories, as described below.

Confidential Bucket

External customer data is confidential, and ringfenced within a dedicated bucket, organized as follows:

Path | Purpose
/accounts/<account_id>/settings.json | Per-account data.
/accounts/<account_id>/survey/<survey_id>/answers_<timestamp>.json | Survey answers.

Each folder under /accounts/<account_id>/survey/ represents a specific survey (set of questions), and each account can hold an unlimited number of surveys. When an authorized account respondent (survey filler) responds to or updates a survey, the answers are stored in a time-stamped file. The per-account data file contains a pointer to the latest set of answers.
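The layout above can be sketched as a pair of key builders. This is only an illustration: the function names are hypothetical, the timestamp is passed in as-is because its exact format is not documented here, and the leading slash shown in the table is dropped on the assumption that object keys are stored without one.

```javascript
// Hypothetical helpers; names and signatures are illustrative only.

// Per-account data file: its path depends only on the account id.
function accountSettingsKey(accountId) {
  return `accounts/${accountId}/settings.json`;
}

// One time-stamped answers file per submission or update.
function answersKey(accountId, surveyId, timestamp) {
  return `accounts/${accountId}/survey/${surveyId}/answers_${timestamp}.json`;
}

console.log(accountSettingsKey("acme"));
console.log(answersKey("acme", "onboarding", 1650000000000));
```

Because the answers path embeds a timestamp, a client cannot guess it and has to follow the pointer kept in the per-account settings file.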

Private Bucket

The private bucket is for internal organizational data.

Path | Purpose
/surveys/<survey_id>/settings.json | Survey metadata
/surveys/<survey_id>/layout_<timestamp>.json | Survey layout
/surveys/<survey_id>/questions_<timestamp>.json | Survey questions
/surveys/<survey_id>/version_<timestamp>.json | Survey versions
/passwords/<username hash> | Password for decrypting credentials

Public Bucket

The public bucket is for access on the open internet.

Path | Purpose
/apps/designer/index.html | Exported Designer notebook entrance point
/apps/survey-staging/index.html | Exported Filler notebook for testing
/apps/survey/index.html | Exported Filler notebook entrance point
/credentials/<username hash> | Encrypted IAM credentials for AWS access


Buckets contain sensitive data, so access is guarded with an IAM policy. We achieve Attribute Based Access Control through extensive use of resource tags.

Organization staff and end users are grouped together under one IAM user group called "User". However, which particular objects in S3 can be accessed depends on the tags attached to the user and to the object.

Thus, a single IAM user may be a designer for one survey, and a survey filler for another.

Management of tags, and therefore of user access control, is done through the Admin notebook.

IAM User Groups

We use two IAM user groups

Principal Tags

IAM users who are members of the group "Users" are tagged to represent data access

Principal tag key | Value
designer | List of surveys the user is authorized to modify
account | List of organizations the user is part of
filler | List of surveys the user can respond to

We make extensive use of embedded lists in values, using " " to delimit entries.*

* In retrospect, we would use "|" (including at the start and end of the value) to simplify the AWS policy expression.
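A minimal sketch of the two encodings follows. The helper names are hypothetical, and the explanation of why delimiters help is one plausible reading of the footnote rather than a statement from the authors: with a delimiter on both sides of every entry, a containment check can match a whole entry instead of any substring.

```javascript
// Parse a space-delimited tag value (the format described above) into entries.
function parseTagList(value) {
  return value.split(" ").filter((entry) => entry.length > 0);
}

// Hypothetical alternative encoding: wrap every entry in "|" delimiters.
function encodePipeList(entries) {
  return entries.length === 0 ? "" : `|${entries.join("|")}|`;
}

const entries = parseTagList("survey10 survey2");

// Plain substring matching cannot tell "survey1" from the start of "survey10".
const looseMatch = entries.join(" ").includes("survey1");

// With delimiters on both sides, only a complete entry matches.
const strictMatch = encodePipeList(entries).includes("|survey1|");

console.log(looseMatch, strictMatch);
```

If identifiers are chosen so that none is a substring of another, the space-delimited form behaves the same way in practice.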

S3 Object Tags

Objects in S3 are tagged to denote who has read/write access to them.

Bucket | Resource tag key | Value
private | survey | The survey this object belongs to
confidential | account | The organization this object belongs to

IAM User-Resource Policy

A single IAM policy is attached to the 'Users' group which enforces constraints such as

The main trick in the IAM policy is the StringLike condition, which performs wildcard matching. Wrapping the object's tag value in * wildcards turns this into substring matching, which is the semantic we need to express "contains" logic between a SCALAR (the object tag) and a LIST (the principal tag). See the following example, in which * is the wildcard.

            "Effect": "Allow",
            "Action": "s3:GetObject",
            // Rule applies to access in the private bucket
            "Resource": "arn:aws:s3:::private/*",
            "Condition": {
                "StringLike": {
                    // look for the "survey" SCALAR inside the 'designer' LIST
                    "aws:PrincipalTag/designer": "*${s3:ExistingObjectTag/survey}*"
                }
            }
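To make the matching rule concrete, here is a small local approximation of how such a condition evaluates. It is not AWS's evaluator: the function names are hypothetical, and the treatment of untagged objects (never matching) is an assumption made for the sketch.

```javascript
// Approximation of StringLike: "*" matches any run of characters,
// "?" matches exactly one character, everything else is literal.
function stringLike(value, pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + source + "$", "s").test(value);
}

// Substitute the object's tag into the policy pattern, then test it against
// the principal's tag value (the space-delimited LIST).
function designerCanRead(principalDesignerTag, objectSurveyTag) {
  if (!objectSurveyTag) return false; // assumed: an untagged object never matches
  return stringLike(principalDesignerTag, "*" + objectSurveyTag + "*");
}

console.log(designerCanRead("survey-a survey-b", "survey-b"));
console.log(designerCanRead("survey-a", "survey-b"));
```

The first call succeeds because the object's tag appears somewhere inside the user's list; the second fails because it does not.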

Survey Slate S3 Data Model

We give users user-tags, and mark which user-tags grant access to a file using matching file-tags. We do not want users perusing our S3 buckets, so we ban all users from using the LIST operation. Users can therefore only PUT or GET files in our buckets, and those operations are gated by tags. For these operations to work, the user must discover the specific file paths somehow.

This means users can only find files if the path is either

  1. deterministic from the knowledge they already have, for example <username>/settings.json
  2. written down in a known file location, for example, each version of a survey uses a timestamp in its filename. The client cannot guess the timestamp, so all the timestamp filenames are listed in the settings.json. So a client must first read the settings, and then it knows about all the other related files.
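The two discovery routes can be sketched as follows. The layout of settings.json is not documented here, so the field name `versions` and its oldest-to-newest ordering are assumptions made purely for illustration, as are the function names.

```javascript
// Hypothetical shape of settings.json: the "versions" field and its ordering
// are assumptions made for this sketch.
const exampleSettings = { versions: ["1650000000000", "1660000000000"] };

// Route 1: a path the client can derive from what it already knows.
function settingsKey(surveyId) {
  return `surveys/${surveyId}/settings.json`;
}

// Route 2: a path that cannot be guessed and must be looked up in settings.json.
function latestQuestionsKey(surveyId, settings) {
  if (settings.versions.length === 0) return null;
  const latest = settings.versions[settings.versions.length - 1];
  return `surveys/${surveyId}/questions_${latest}.json`;
}

console.log(settingsKey("demo"));
console.log(latestQuestionsKey("demo", exampleSettings));
```

A client would fetch the first key, parse the result, and only then be able to request the second.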

A consequence is this: if you wish to manually change the state of the system, you can add/delete/move files, but you must also remember to add access file-tags (otherwise nobody can read them) and update the relevant file that references them (e.g. the nearest settings.json).
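As a sketch of what "adding the access file-tag" looks like for an upload, the helper below builds the parameters for an S3 PutObject request, whose `Tagging` field takes a URL-encoded `key=value` string. The helper name, bucket name and object key are hypothetical, and no request is sent.

```javascript
// Build PutObject parameters that carry the access tag along with the upload.
function taggedPutParams({ bucket, key, body, tagKey, tagValue }) {
  return {
    Bucket: bucket,
    Key: key,
    Body: body,
    // URL-encoded "key=value" pair, as expected by the Tagging parameter.
    Tagging: `${encodeURIComponent(tagKey)}=${encodeURIComponent(tagValue)}`,
  };
}

const params = taggedPutParams({
  bucket: "private-example", // placeholder bucket name
  key: "surveys/demo/questions_1660000000000.json",
  body: JSON.stringify({ questions: [] }),
  tagKey: "survey",
  tagValue: "demo",
});

console.log(params.Tagging); // survey=demo
```

After the upload, the referencing settings.json still has to be updated so that clients can find the new file.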

IAM User Capability Self-Discovery

Users are granted permission to read their own tags so they can discover what they have access to.

            "Effect": "Allow",
            "Action": [
                "iam:GetUser",
                "iam:ListUserTags"
            ],
            "Resource": "arn:aws:iam::032151534090:user/${aws:username}"

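A client could turn the result of that tag lookup into a capability map along the following lines. The input follows the shape of an IAM ListUserTags response (a `Tags` array of `Key`/`Value` pairs); the function name and the returned structure are hypothetical.

```javascript
// Convert a ListUserTags-style response into lists of surveys and accounts.
function capabilitiesFromTags(response) {
  const capabilities = { designer: [], account: [], filler: [] };
  for (const { Key, Value } of response.Tags ?? []) {
    // Ignore tags that are not part of the access model.
    if (Object.prototype.hasOwnProperty.call(capabilities, Key)) {
      capabilities[Key] = Value.split(" ").filter(Boolean);
    }
  }
  return capabilities;
}

const sample = {
  Tags: [
    { Key: "designer", Value: "survey-a survey-b" },
    { Key: "filler", Value: "survey-c" },
    { Key: "team", Value: "research" },
  ],
};

console.log(capabilitiesFromTags(sample));
```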
Reference Policy

const policy = ({
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "UserListOwnTags",
            "Effect": "Allow",
            "Action": [
                "iam:GetUser",
                "iam:ListUserTags"
            ],
            "Resource": "arn:aws:iam::032151534090:user/${aws:username}"
        },
        {
            "Sid": "DesignerReadSurvey",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::private-mjgvubdpwmdipjsn/*",
            "Condition": {
                "StringLike": {
                    "aws:PrincipalTag/designer": "*${s3:ExistingObjectTag/survey}*"
                }
            }
        },
        {
            "Sid": "DesignerWriteSurvey",
            "Effect": "Allow",
            "Action": [
                "s3:putObjectTagging",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::private-mjgvubdpwmdipjsn/*",
            "Condition": {
                "StringLike": {
                    "aws:PrincipalTag/designer": "*${s3:RequestObjectTag/survey}*"
                }
            }
        },
        {
            "Sid": "FillerReadAccount",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::confidential-bspqugxjstgxwjnt/*",
            "Condition": {
                "StringLike": {
                    "aws:PrincipalTag/account": "*${s3:ExistingObjectTag/account}*"
                }
            }
        },
        {
            "Sid": "FillerWriteAccount",
            "Effect": "Allow",
            "Action": [
                "s3:putObjectTagging",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::confidential-bspqugxjstgxwjnt/*",
            "Condition": {
                "StringLike": {
                    "aws:PrincipalTag/account": "*${s3:RequestObjectTag/account}*"
                }
            }
        },
        {
            "Sid": "FillerReadSurvey",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::private-mjgvubdpwmdipjsn/*",
            "Condition": {
                "StringLike": {
                    "aws:PrincipalTag/filler": "*${s3:ExistingObjectTag/survey}*"
                }
            }
        }
    ]
})

Internal and External Access Keys

There are two key types in Survey Slate: internal and external. All users are ultimately represented as IAM users and log in with IAM credentials. With internal keys, the IAM credential is never stored anywhere, and it is up to the administrator to distribute it to the person it belongs to. The internal key is meant for organization staff members.

External keys are meant for people outside the organization. The keys are encrypted and stored in the PUBLIC S3 bucket. Nobody can decrypt the credentials without the password. The password is stored in the PRIVATE bucket. Users are given the password, which allows them to decrypt their AWS IAM access keys and access resources on AWS according to the IAM policy for those credentials.
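The source does not say which cipher or key derivation protects the stored credentials, so the following is only a sketch of the general pattern. It runs under Node.js using the built-in crypto module; the choice of scrypt and AES-256-GCM, the function names and the blob layout are all assumptions. In a browser, the Web Crypto API would play the same role.

```javascript
const { scryptSync, randomBytes, createCipheriv, createDecipheriv } = require("node:crypto");

// Encrypt an access key pair with a password-derived key (assumed scheme).
function encryptCredentials(credentials, password) {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const key = scryptSync(password, salt, 32);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(JSON.stringify(credentials), "utf8"), cipher.final()]);
  return {
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypt the blob; throws if the password is wrong or the blob was altered.
function decryptCredentials(blob, password) {
  const key = scryptSync(password, Buffer.from(blob.salt, "base64"), 32);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(blob.iv, "base64"));
  decipher.setAuthTag(Buffer.from(blob.tag, "base64"));
  const plain = Buffer.concat([decipher.update(Buffer.from(blob.data, "base64")), decipher.final()]);
  return JSON.parse(plain.toString("utf8"));
}

// Placeholder values, not real credentials.
const blob = encryptCredentials({ accessKeyId: "AKIAEXAMPLE", secretAccessKey: "example" }, "user-password");
console.log(decryptCredentials(blob, "user-password").accessKeyId);
```

The encrypted blob is safe to publish only to the extent that the password stays secret, which is why the two are kept in different buckets.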

Here we outline the provisioning of end-user credentials, which results in a link the user can follow to complete surveys.

The user can be sent the magic link via email, which takes them directly to the survey filler app. Then

At this point, the browser will be in possession of IAM access keys and can issue authenticated requests to the AWS SDK. The first step will be querying for IAM tags, which will tell the user what surveys and accounts they have access to. From there they can read the relevant **/settings.json for survey/account metadata.

⚠️ You do not need to include the username and password in the link! If these are omitted, a login form is presented. This can feel more secure than including them in the URL.
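The fallback behaviour can be sketched as follows. The real link format is not given here, so the parameter names `username` and `password`, their placement in the URL fragment, and the example.org address are all assumptions made for illustration.

```javascript
// Returns { username, password } when both are present in the link, else null,
// in which case the app would fall back to presenting a login form.
function credentialsFromLink(link) {
  const fragment = new URL(link).hash.replace(/^#/, "");
  const params = new URLSearchParams(fragment);
  const username = params.get("username");
  const password = params.get("password");
  return username && password ? { username, password } : null;
}

const withCredentials = credentialsFromLink(
  "https://example.org/survey/index.html#username=alice&password=s3cret"
);
const withoutCredentials = credentialsFromLink("https://example.org/survey/index.html");

console.log(withCredentials ? "log in automatically" : "show login form");
console.log(withoutCredentials ? "log in automatically" : "show login form");
```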

CloudFront Application Hosting

The three applications are implemented as notebooks for ease of development. For deployment, the notebooks are exported as a tarball and placed in S3. AWS's CDN, CloudFront, then serves the content over HTTPS from the S3 bucket.

The Deploy Notebook to S3 notebook is used to provide top-level HTML framing around Observable's exported notebook tarball.

Note that your browser often caches the CloudFront output, so after a deploy, try a hard refresh! If something does not seem to be on the main URL, try going to the S3 bucket directly.

Staging environment for Filler code changes

The Filler app often requires code changes, e.g. new survey components. As the app is shared across all surveys, it is dangerous to deploy new code without testing first: a bad code change could prevent people outside the organization from accessing their surveys or, worse, cause them to lose their data!

Thus we have a staging application that you can test first:

https://www.surveyslate.org/survey-staging/index.html

This has access to all production data so it is very close to the real thing.

Installing on your own AWS account

Our notebooks are attached to a private AWS account with no public access. This section explains how to move these applications onto your own AWS infrastructure. We have not spent much engineering time on this story yet, but we welcome external contributions to streamline it.

Provision AWS resources

You need three buckets and a "user" IAM group with an attached "user" IAM policy.

You will need to update the IAM policy to reference your AWS account number and your chosen bucket names.
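One way to avoid hand-editing the policy is to generate it from your own values. The sketch below reproduces the statements of the reference policy shown earlier in this document; the helper names are hypothetical, and the account number and bucket names in the usage lines are placeholders.

```javascript
// One tag-gated statement: allow the actions when the user's tag list
// contains the tag carried by the object or the request.
function tagGated(sid, actions, bucket, principalTag, objectTag) {
  return {
    Sid: sid,
    Effect: "Allow",
    Action: actions,
    Resource: "arn:aws:s3:::" + bucket + "/*",
    Condition: {
      StringLike: { ["aws:PrincipalTag/" + principalTag]: "*${" + objectTag + "}*" },
    },
  };
}

function buildUsersPolicy({ accountId, privateBucket, confidentialBucket }) {
  const read = "s3:GetObject";
  const write = ["s3:PutObjectTagging", "s3:PutObject"];
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "UserListOwnTags",
        Effect: "Allow",
        Action: ["iam:GetUser", "iam:ListUserTags"],
        Resource: "arn:aws:iam::" + accountId + ":user/${aws:username}",
      },
      tagGated("DesignerReadSurvey", read, privateBucket, "designer", "s3:ExistingObjectTag/survey"),
      tagGated("DesignerWriteSurvey", write, privateBucket, "designer", "s3:RequestObjectTag/survey"),
      tagGated("FillerReadAccount", read, confidentialBucket, "account", "s3:ExistingObjectTag/account"),
      tagGated("FillerWriteAccount", write, confidentialBucket, "account", "s3:RequestObjectTag/account"),
      tagGated("FillerReadSurvey", read, privateBucket, "filler", "s3:ExistingObjectTag/survey"),
    ],
  };
}

const generated = buildUsersPolicy({
  accountId: "123456789012", // placeholder account number
  privateBucket: "private-example",
  confidentialBucket: "confidential-example",
});
console.log(JSON.stringify(generated, null, 2));
```

The printed JSON can be pasted into the IAM console as the policy document for the group.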

CloudFront is not essential at this stage and can be set up later.

Fork notebooks and update dependencies

Fork the core and support notebooks. Ensure the forked core notebook dependencies point to the new support notebooks.

Set AWS configuration

Update the configuration in the configuration notebook.

Create an Admin access key

Create access keys for the person who should be able to create users. Those credentials will be used with the admin app.

Create a test survey

Create a test survey using the admin app and the admin credentials.

Provision a test user with internal and external access

Provision a test account

Grant the test user design and filler permissions to your test project and test account.

First select your test user

Then make them a designer

Grant them filler access

Grant them access to the test organization

Test your user

You should be able to use the internal access key to use the forked Designer app. You should be able to use the external access link to use the forked Filler app. You can tell if the access keys are working if you see the account/project/survey drop downs get populated.

⚠️ Check the Console logs if you encounter problems (usually option + command + i in Chrome on macOS).

