Module Nethttpd_types


module Nethttpd_types: sig .. end
Type definitions for the HTTP daemon, and an introduction


Many types can also be found in the Nethttp module (part of netstring). Furthermore, Netcgi_env and Netcgi_types are of interest (part of cgi).

Exceptions


exception Standard_response of Nethttp.http_status * Nethttp.http_header option * string option
Some HTTP containers allow you to raise this exception. The standard response corresponding to http_status is sent back to the client. If the third argument is present, an entry is written to the error log.
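
How a container might react to this exception can be sketched as follows. This is a self-contained illustration, not Nethttpd code: the status and header types, the local exception, and the run_handler function are hypothetical stand-ins for Nethttp.http_status, Nethttp.http_header, the real Standard_response, and a real HTTP container.

```ocaml
(* Hypothetical stand-ins for Nethttp.http_status and Nethttp.http_header. *)
type status = [ `Ok | `Forbidden | `Not_found ]
type header = (string * string) list

(* Local exception with the same shape as Nethttpd_types.Standard_response. *)
exception Standard_response of status * header option * string option

(* Collected error log entries, newest first. *)
let error_log : string list ref = ref []

(* A toy container: runs the handler and maps the exception to a status.
   The log is written only if the third argument is present. *)
let run_handler (handler : unit -> unit) : status =
  try handler (); `Ok
  with Standard_response (status, _header, log_msg) ->
    (match log_msg with
     | Some msg -> error_log := msg :: !error_log
     | None -> ());
    status
```

A handler would then signal a failure with, for example, raise (Standard_response (`Not_found, None, Some "no such page")).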

Environment


class type virtual v_extended_environment = object .. end
An extension of cgi_environment for use with the daemon.
class type extended_environment = object .. end
Same as v_extended_environment, but without virtual methods

Construction of environments


class virtual empty_environment : object .. end
This class implements an environment with defined internal containers.
class redirected_environment : ?in_state:Netcgi_env.input_state -> ?in_header:Nethttp.http_header -> ?properties:(string * string) list -> ?in_channel:Netchannels.in_obj_channel -> extended_environment -> extended_environment
This class overlays the input-side containers of an existing environment.

Auxiliary Functions for Environments


val output_static_response : #extended_environment ->
Nethttp.http_status -> Nethttp.http_header option -> string -> unit
Outputs the string argument as response body, together with the given status and the header (optional). Response header fields are set as follows: If the header is not passed, the header of the environment is taken. If the header argument exists, however, it overrides the header of the environment.
val output_file_response : #extended_environment ->
Nethttp.http_status ->
Nethttp.http_header option -> string -> int64 -> int64 -> unit
Outputs the contents of a file as response body, together with the given status and the header (optional). The string is the file name. The first int64 number is the position in the file where to start, and the second number is the length of the body. Response header fields are set as follows: Note that Content-Range is not set automatically, even if the file is only partially transferred.

If the header is not passed, the header of the environment is taken. If the header argument exists, however, it overrides the header of the environment.

The function raises Sys_error when the file cannot be read.
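
The meaning of the two int64 arguments can be illustrated with the standard library alone. The sketch below is not how output_file_response is implemented; read_range and content_range are hypothetical helpers. The first reads a byte range the way the position and length arguments describe it, and the second builds the Content-Range value a caller would have to add to the header itself for a partial transfer.

```ocaml
(* Reads [len] bytes starting at byte offset [pos] of the file [filename].
   Raises Sys_error if the file cannot be opened, and End_of_file if the
   file is shorter than pos + len. *)
let read_range (filename : string) (pos : int64) (len : int64) : string =
  let ic = open_in_bin filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
       LargeFile.seek_in ic pos;
       really_input_string ic (Int64.to_int len))

(* Value of a Content-Range header field for the same range, given the
   total file size.  The last byte position is inclusive. *)
let content_range (pos : int64) (len : int64) (total : int64) : string =
  Printf.sprintf "bytes %Ld-%Ld/%Ld" pos (Int64.pred (Int64.add pos len)) total
```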

class type min_config = object .. end
Minimal configuration needed for output_std_response
val output_std_response : #min_config ->
#extended_environment ->
Nethttp.http_status -> Nethttp.http_header option -> string option -> unit
Outputs a "standard response" for the http_status. The string argument is an optional entry into the error log.

If the header is not passed, an empty header is taken. If the header argument exists, this header is taken. The header of the environment is never taken.
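
The different header rules of the output functions can be summarized in a small sketch. The header type is a hypothetical stand-in for Nethttp.http_header, and both functions are illustrations of the rules stated above, not part of the library.

```ocaml
(* Hypothetical stand-in for Nethttp.http_header. *)
type header = (string * string) list

(* Rule for output_static_response and output_file_response:
   an explicit header wins, otherwise the environment's header is used. *)
let header_for_static ~(env_header : header) (arg : header option) : header =
  match arg with
  | Some h -> h
  | None -> env_header

(* Rule for output_std_response: the environment's header is never
   consulted; a missing argument means an empty header. *)
let header_for_std (arg : header option) : header =
  Option.value arg ~default:[]
```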


Service Providers

Service providers are defined using three class types: http_service, http_service_receiver, and http_service_generator.

An implementor is free to define only one class that satisfies all three class types at once. However, this is only an option.

The three objects reflect three stages of HTTP processing. The stages have been made explicit so that the implementor of a service can intercept the points in time when the processing of the next stage begins. Furthermore, in multi-threaded environments the stages may be performed in the contexts of different threads.
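
The staging can be sketched with heavily simplified class types. Everything below is a hypothetical model: the env record replaces extended_environment, the class types carry only one method each, and the reaction type has only two of its cases. It shows a single class that satisfies all three class types, and a driver that walks through the stages.

```ocaml
(* Hypothetical, simplified stand-in for the environment. *)
type env = { path : string; mutable out : string }

class type generator = object
  method generate_response : env -> unit          (* stage 3 *)
end

class type receiver = object
  method process_body : env -> generator          (* stage 2 *)
end

type reaction =
  [ `Accept_body of receiver
  | `Reject_body of generator ]

class type service = object
  method process_header : env -> reaction         (* stage 1 *)
end

(* One class that satisfies all three class types at once. *)
class echo_service = object (self)
  method process_header (_ : env) : reaction =
    `Accept_body (self :> receiver)
  method process_body (_ : env) : generator =
    (self :> generator)
  method generate_response (env : env) : unit =
    env.out <- "You requested " ^ env.path
end

(* Driver: runs the stages in order.  Rejecting the body skips stage 2. *)
let run (svc : service) (env : env) : unit =
  match svc#process_header env with
  | `Accept_body r -> (r#process_body env)#generate_response env
  | `Reject_body g -> g#generate_response env
```

Splitting the work into three separate objects instead would allow each stage to carry its own state, which is the reason the single-class form is only an option.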

In addition to the three-stage model there are also several faster paths of processing:


exception Redirect_request of string * Nethttp.http_header
The "early" redirect is only allowed in stage 1 of HTTP processing. The string argument is the new URI path of the request. The header can also be replaced, except for the fields that are needed to decode the request body. It is not possible to change the method.
exception Redirect_response of string * Nethttp.http_header
The "late" redirect is only allowed in stage 3 of HTTP processing. The string argument is the new URI path of the request. The header can also be replaced, except for the fields that are needed to decode the request body. The method is always changed to GET.
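
The difference between the two redirects can be sketched as follows. The request record, the header type, and the redirected function are hypothetical, and the exceptions are local copies with a simplified header type. The sketch also ignores the restriction on header fields needed to decode the request body.

```ocaml
(* Hypothetical stand-ins for Nethttp.http_header and a request. *)
type header = (string * string) list
type request = { meth : string; path : string; header : header }

exception Redirect_request of string * header    (* early, stage 1 *)
exception Redirect_response of string * header   (* late, stage 3 *)

(* Computes the request to process next after a redirect.  The early
   redirect keeps the method; the late one always switches to GET.
   Any other exception is not a redirect. *)
let redirected (req : request) (e : exn) : request option =
  match e with
  | Redirect_request (path, header) -> Some { req with path; header }
  | Redirect_response (path, header) -> Some { meth = "GET"; path; header }
  | _ -> None
```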
class type http_service_generator = object .. end
class type http_service_receiver = object .. end
type http_service_reaction = [ `Accept_body of http_service_receiver
| `File of
Nethttp.http_status * Nethttp.http_header option * string * int64 * int64
| `Reject_body of http_service_generator
| `Static of Nethttp.http_status * Nethttp.http_header option * string
| `Std_response of
Nethttp.http_status * Nethttp.http_header option * string option ]
Indicates the immediate reaction upon an arriving HTTP header:
class type ['a] http_service = object .. end

Helpers


val update_alist : ('a * 'b) list -> ('a * 'b) list -> ('a * 'b) list
update_alist updl l: Returns the alist with all elements of updl and all elements of l that are not members of updl.
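
A function with this behaviour can be written in one line. The version below is a sketch, not necessarily the library's code. It assumes that membership is decided by the key (the first component of each pair) and that the elements of updl come first in the result.

```ocaml
(* Entries of updl override entries of l that have the same key. *)
let update_alist (updl : ('a * 'b) list) (l : ('a * 'b) list) : ('a * 'b) list =
  updl @ List.filter (fun (k, _) -> not (List.mem_assoc k updl)) l

(* The binding for "a" is replaced, the binding for "b" is kept. *)
let example = update_alist [ "a", 1 ] [ "a", 0; "b", 2 ]
(* example = [ "a", 1; "b", 2 ] *)
```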

Overview of the HTTP daemon

This library implements an HTTP 1.1 server. Because it is a library and not a stand-alone server like Apache, it can be used in very flexible ways. The disadvantage is that the user of the library must do more to get a running program than just configuring the daemon.

The daemon has five modules: Nethttpd_types, Nethttpd_kernel, Nethttpd_reactor, Nethttpd_engine, and Nethttpd_services.

It is also important to mention what Nethttpd does not include:

It is hoped to add this functionality later in a generic way (i.e. not only for Nethttpd).

Suggested strategy

First, look at Nethttpd_services. This module allows the user to define the services of the web server. For example, the following code defines a single host with a URL space:

(* Assumes "open Nethttpd_services"; process_request is a user-defined handler. *)
let fs_spec =
  { file_docroot = "/data/docroot";
    file_uri = "/";
    file_suffix_types = [ "txt", "text/plain";
                          "html", "text/html" ];
    file_default_type = "application/octet-stream";
    file_options = [ `Enable_gzip;
                     `Enable_listings simple_listing
                   ]
  }

let srv =
  host_distributor
    [ default_host ~pref_name:"localhost" ~pref_port:8765 (),
      uri_distributor
        [ "*", (options_service());
          "/files", (file_service fs_spec);
          "/service", (dynamic_service
                           { dyn_handler = process_request;
                             dyn_activation = std_activation `Std_activation_buffered;
                             dyn_uri = Some "/service";
                             dyn_translator = file_translator fs_spec;
                             dyn_accept_all_conditionals = false
                           })
        ]
    ]

The /files path is bound to a static service, i.e. the files found in the directory /data/docroot can be accessed over the web. The record fs_spec configures the static service.

The /service path is bound to a dynamic service, i.e. the requests are processed by the user-defined function process_request. This function is very similar to the request processors used in Netcgi.

The symbolic * path is only bound for the OPTIONS method. This is recommended, because clients can use this method to find out the capabilities of the server.

Second, select an encapsulation. As mentioned, the reactor is much simpler to use, but you must take a multi-threaded approach to serve multiple connections simultaneously. The engine is more efficient, but may use more memory (unless it is only used for static pages).

Third, write the code to create the socket and to accept connections. For the reactor, you should do this in a multi-threaded way (but multi-processing is also possible). For the engine, you should do this in an event-based way.

Now, just call Nethttpd_reactor.process_connection or Nethttpd_engine.process_connection, and pass the socket descriptor as argument. These functions do all the rest.

The Ocamlnet source tarball includes examples for several approaches. In particular, look at file_reactor.ml, file_mt_reactor.ml, and file_engine.ml.

Configuration

One remaining question is how to set all these configuration options.

The user configures the daemon by passing a configuration object. This object has a number of methods that usually return constants, but there are also a few functions, e.g.

  let config : http_reactor_config =
    object
      method config_timeout_next_request = 15.0
      method config_timeout = 300.0
      method config_reactor_synch = `Write
      method config_cgi = Netcgi_env.default_config
      method config_error_response n = "<html>Error " ^ string_of_int n ^ "</html>"
      method config_log_error _ _ _ _ msg =
        Printf.printf "Error log: %s\n" msg
      method config_max_reqline_length = 256
      method config_max_header_length = 32768
      method config_max_trailer_length = 32768
      method config_limit_pipeline_length = 5
      method config_limit_pipeline_size = 250000
    end 

Some of the options are interpreted by the encapsulation, and some by the kernel. The object approach was chosen because the layers of the daemon can then be mapped onto a hierarchy of class types.
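
How a hierarchy of class types lets one configuration object serve several layers can be sketched as follows. The class types kernel_config and reactor_config and the function header_too_long are hypothetical and much smaller than the real ones; only the method names are taken from the example above.

```ocaml
(* Hypothetical class type for the options the kernel layer reads. *)
class type kernel_config = object
  method config_max_header_length : int
end

(* The encapsulation layer adds its own options on top. *)
class type reactor_config = object
  inherit kernel_config
  method config_timeout : float
end

(* A kernel-level function accepts any object that has at least the
   kernel options, so a reactor_config can be passed unchanged. *)
let header_too_long (config : #kernel_config) (n : int) : bool =
  n > config#config_max_header_length

let config : reactor_config = object
  method config_max_header_length = 32768
  method config_timeout = 300.0
end

let () = assert (not (header_too_long config 1024))
```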

The options are documented in the modules where the class types are defined. Some of them are difficult to understand. When in doubt, it is recommended to just copy the values found in the examples, because these are quite reasonable for typical usage scenarios.