Every web application that accepts user input needs protection against cross-site scripting (XSS). One of the defenses OWASP recommends, alongside output encoding, is HTML sanitization: removing dangerous markup and JavaScript from raw HTML strings. In this blog, we will discuss how to sanitize the request body and URL query parameters in Golang, assuming familiarity with the Gin web framework.
Packages Required
import (
	"fmt"
	"reflect"
	"regexp"

	"github.com/gin-gonic/gin"
	"github.com/microcosm-cc/bluemonday"
)
For HTML sanitization, we use bluemonday, whose policies can be customized per use case. Here we use StrictPolicy, which returns a policy with an empty allowlist and therefore strips all HTML elements and their attributes.
Sanitization Methods
Since parameters can be nested inside arrays and objects, we use recursion to sanitize the entire parameter. Only strings, integers (signed and unsigned), floats, booleans, slices/arrays, maps, and structs are handled. Other types, such as pointers, are not implemented yet.
func sanitizeRecursively(param interface{}) (interface{}, error) {
	if param == nil {
		return param, nil
	}
	paramValue := reflect.ValueOf(param)
	switch paramValue.Kind() {
	case reflect.String:
		// paramValue.String() also works for named string types,
		// where a param.(string) assertion would panic.
		return sanitizeString(paramValue.String()), nil
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64, reflect.Float32,
		reflect.Float64, reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Bool:
		// Numbers and booleans cannot carry markup, so they pass through unchanged.
		return param, nil
	case reflect.Slice, reflect.Array:
		return sanitizeArray(param)
	case reflect.Map:
		return sanitizeMap(param)
	case reflect.Struct:
		return sanitizeStruct(param)
	default:
		// Every path must return; report unsupported kinds as an error.
		return nil, fmt.Errorf("type not supported: %s", paramValue.Kind())
	}
}
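To make the dispatch concrete, here is a small standalone sketch of how reflect.Kind groups values. The describe helper and the UserName type are hypothetical names used only for this illustration. Note that a value of a named type reports the Kind of its underlying type, so UserName lands in the string branch even though it is not the built-in string type.

```go
package main

import (
	"fmt"
	"reflect"
)

// UserName is a hypothetical named string type used only for this illustration.
type UserName string

// describe reports which branch of a Kind-based switch a value would take.
func describe(v interface{}) string {
	if v == nil {
		return "nil"
	}
	switch reflect.ValueOf(v).Kind() {
	case reflect.String:
		return "string"
	case reflect.Slice, reflect.Array:
		return "list"
	case reflect.Map:
		return "map"
	case reflect.Struct:
		return "struct"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(describe("hello"))                // string
	fmt.Println(describe(UserName("john")))       // string
	fmt.Println(describe([]string{"a", "b"}))     // list
	fmt.Println(describe([2]int{1, 2}))           // list
	fmt.Println(describe(map[string]int{"a": 1})) // map
	fmt.Println(describe(struct{ A int }{1}))     // struct
	fmt.Println(describe(3.14))                   // other
}
```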
String Sanitization
This method takes a string that may contain an HTML fragment or document and applies the policy's allowlist. It returns the sanitized string, which may be empty. With StrictPolicy, all tags are stripped. As an extra step, it also removes the standalone word "javascript" (case-sensitive) from the result, for example from javascript: URLs. Any other logic can be added here based on the use case.
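var (
	// StrictPolicy allows no elements or attributes, so all HTML is stripped.
	sanitizerInstance = bluemonday.StrictPolicy()
	// Compiled once at package level instead of on every call.
	javascriptWord = regexp.MustCompile(`\bjavascript\b`)
)

func sanitizeString(param string) string {
	sanitizedHtmlStr := sanitizerInstance.Sanitize(param)
	return javascriptWord.ReplaceAllString(sanitizedHtmlStr, "")
}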
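The behavior of the `\bjavascript\b` pattern is easy to check in isolation. The sketch below uses only the standard library. The helper names stripExact and stripAnyCase are hypothetical, and the case-insensitive variant is shown as one possible extension rather than as part of the original code.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are compiled once at package level.
var (
	exactWord = regexp.MustCompile(`\bjavascript\b`)     // same pattern as in sanitizeString
	anyCase   = regexp.MustCompile(`(?i)\bjavascript\b`) // assumption: a case-insensitive variant
)

// stripExact removes the lowercase word "javascript" only.
func stripExact(s string) string { return exactWord.ReplaceAllString(s, "") }

// stripAnyCase removes the word regardless of letter case.
func stripAnyCase(s string) string { return anyCase.ReplaceAllString(s, "") }

func main() {
	fmt.Printf("%q\n", stripExact("javascript:alert(1)"))   // ":alert(1)"
	fmt.Printf("%q\n", stripExact("JavaScript:alert(1)"))   // "JavaScript:alert(1)"
	fmt.Printf("%q\n", stripAnyCase("JavaScript:alert(1)")) // ":alert(1)"
	fmt.Printf("%q\n", stripExact("I like javascript"))     // "I like "
}
```

The second call shows that mixed-case input passes through the exact pattern untouched, and the last call shows that ordinary prose containing the word is altered too, so whether to keep this step depends on the kind of data being accepted.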
Slice And Array Sanitization
Since all values in the array must be checked, we iterate over the array and call sanitizeRecursively for nested sanitizing.
func sanitizeArray(param interface{}) ([]interface{}, error) {
	paramValue := reflect.ValueOf(param)
	// Pre-size the result; each element is sanitized recursively.
	sanitisedArray := make([]interface{}, 0, paramValue.Len())
	for index := 0; index < paramValue.Len(); index++ {
		sanitisedParam, err := sanitizeRecursively(paramValue.Index(index).Interface())
		if err != nil {
			return nil, err
		}
		sanitisedArray = append(sanitisedArray, sanitisedParam)
	}
	return sanitisedArray, nil
}
Map Sanitization
Similarly, for maps, all pairs must be checked and sanitized using recursive methods.
func sanitizeMap(param interface{}) (map[string]interface{}, error) {
	paramValue := reflect.ValueOf(param)
	sanitisedMap := make(map[string]interface{})
	for _, key := range paramValue.MapKeys() {
		sanitisedParam, err := sanitizeRecursively(paramValue.MapIndex(key).Interface())
		if err != nil {
			return nil, err
		}
		// fmt.Sprint handles non-string keys; key.String() only returns the
		// key text when the key is of string kind.
		sanitisedMap[fmt.Sprint(key.Interface())] = sanitisedParam
	}
	return sanitisedMap, nil
}
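One detail worth knowing when turning reflected map keys into strings: reflect.Value.String returns the key text only for keys of string kind. For any other kind it returns a placeholder describing the type instead of the value. The following is a minimal sketch of a more general conversion. keysAsStrings is a hypothetical helper written for this illustration, and the keys are sorted only to make the output deterministic.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// keysAsStrings renders the keys of any map as sorted strings.
// Non-map input (including nil) yields a nil slice.
func keysAsStrings(m interface{}) []string {
	v := reflect.ValueOf(m)
	if v.Kind() != reflect.Map {
		return nil
	}
	keys := make([]string, 0, v.Len())
	for _, k := range v.MapKeys() {
		// fmt.Sprint formats the underlying value, whatever the key type is.
		keys = append(keys, fmt.Sprint(k.Interface()))
	}
	sort.Strings(keys)
	return keys
}

func main() {
	fmt.Println(keysAsStrings(map[string]int{"b": 2, "a": 1})) // [a b]
	fmt.Println(keysAsStrings(map[int]string{2: "x", 1: "y"})) // [1 2]
	// For a non-string kind, String() yields a type placeholder, not "42".
	fmt.Println(reflect.ValueOf(42).String())
}
```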
Structure Sanitization
For structs, we iterate over the fields and sanitize each one. The result is returned as a map keyed by field name rather than as a new struct.
func sanitizeStruct(param interface{}) (map[string]interface{}, error) {
	// Indirect lets the function accept either a struct or a pointer to one.
	structValue := reflect.Indirect(reflect.ValueOf(param))
	structType := structValue.Type()
	sanitisedStruct := make(map[string]interface{})
	for i := 0; i < structValue.NumField(); i++ {
		field := structType.Field(i)
		if field.PkgPath != "" {
			// Unexported field: Interface() would panic, so skip it.
			continue
		}
		sanitisedField, err := sanitizeRecursively(structValue.Field(i).Interface())
		if err != nil {
			return nil, err
		}
		sanitisedStruct[field.Name] = sanitisedField
	}
	return sanitisedStruct, nil
}
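Field iteration with reflection has one common pitfall: calling Interface() on an unexported field panics. As a rough sketch of how to walk a struct safely, the example below copies only exported fields into a map. The Profile type and the exportedFields helper are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Profile is a hypothetical struct used only for this illustration.
type Profile struct {
	Name  string
	Email string
	token string // unexported: reflection cannot read it via Interface()
}

// exportedFields copies the exported fields of a struct, or of a pointer to
// a struct, into a map keyed by field name. Other input yields nil.
func exportedFields(s interface{}) map[string]interface{} {
	v := reflect.Indirect(reflect.ValueOf(s))
	if v.Kind() != reflect.Struct {
		return nil
	}
	t := v.Type()
	out := make(map[string]interface{})
	for i := 0; i < v.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" { // a non-empty PkgPath marks an unexported field
			continue
		}
		out[f.Name] = v.Field(i).Interface()
	}
	return out
}

func main() {
	p := Profile{Name: "John", Email: "john@gmail.com", token: "secret"}
	fields := exportedFields(&p)
	fmt.Println(len(fields))     // 2
	fmt.Println(fields["Name"])  // John
	fmt.Println(fields["Email"]) // john@gmail.com
}
```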
Body And Query Params Sanitization
When using the gin web framework, we can get request params in c.Request, where c is gin’s context. But for that, first, we need to populate c.Request.PostForm and c.Request.Form using the ParseForm method.
ParseForm parses the raw query from the URL and updates c.Request.Form. For POST, PUT, and PATCH requests, it also reads the request body, parses it as a form, and puts the results into both c.Request.PostForm and c.Request.Form.
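The difference between Form and PostForm can be seen without running a server. The sketch below uses only the standard library (net/http/httptest) instead of gin, and builds a POST request that has both a query string and a form-encoded body. The parseDemo helper and the sample values are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// parseDemo builds a POST request carrying a query parameter and a
// form-encoded body, then calls ParseForm on it.
func parseDemo() (*http.Request, error) {
	body := strings.NewReader("name=John&email=john%40gmail.com")
	req := httptest.NewRequest(http.MethodPost, "/api/web/users?pinCode=411043", body)
	// Without this content type, ParseForm does not treat the body as a form.
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	if err := req.ParseForm(); err != nil {
		return nil, err
	}
	return req, nil
}

func main() {
	req, err := parseDemo()
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(req.PostForm.Get("name"))          // John
	fmt.Println(req.PostForm.Get("pinCode") == "") // true: query params are not in PostForm
	fmt.Println(req.Form.Get("pinCode"))           // 411043
	fmt.Println(req.Form.Get("email"))             // john@gmail.com
}
```

Because PostForm holds only body fields, returning it for POST requests means query parameters on a POST are not included, and a JSON body is not parsed by ParseForm at all.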
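The getRequestParams function below reads the request params from gin’s context. Only POST and GET requests are handled; other methods return nil.
func getRequestParams(c *gin.Context) map[string][]string {
	// ParseForm populates c.Request.Form and c.Request.PostForm.
	if err := c.Request.ParseForm(); err != nil {
		return nil
	}
	if c.Request.Method == "POST" {
		return c.Request.PostForm
	} else if c.Request.Method == "GET" {
		return c.Request.Form
	}
	return nil
}

func getSanitizedParams(c *gin.Context) {
	params := getRequestParams(c)
	sanitizedParams, err := SanitizeBodyAndQuery(params)
	if err != nil {
		fmt.Println("Sanitization failed -", err)
		return
	}
	// Println inserts a space between operands, so no trailing space is needed.
	fmt.Println("Params -", params)
	fmt.Println("Sanitized Params -", sanitizedParams)
}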
The SanitizeBodyAndQuery function then recursively sanitizes all the params.
func SanitizeBodyAndQuery(params interface{}) (interface{}, error) {
	// Propagate the error instead of discarding it.
	return sanitizeRecursively(params)
}
Examples
Example 1
API Request - POST https://localhost:8080/api/web/users/:user_id
Input request body -
name:<title>John</title>
email:<p>john@gmail.com</p>
phone:<th>+12345678900</th>
Result -
Params - map[email:[<p>john@gmail.com</p>] phone:[<th>+12345678900</th>] name:[<title>John</title>]]
Sanitized Params - map[email:[john@gmail.com] phone:[+12345678900] name:[John]]
Example 2
API Request - GET https://localhost:8080/api/web/users?pinCode=<p>411043</p>
Result -
Params - map[pinCode:[<p>411043</p>]]
Sanitized Params - map[pinCode:[411043]]
From the above response, you can see that the HTML tags like p, title, th, etc. are stripped out and the data is sanitized.