mime_tree

parses MIME messages into an array structure
array mime_tree (in message_text string, [in flag integer ]);
Description

This function parses MIME (RFC2045) messages coming from RFC822 or HTTP sources. It parses the text and produces an array structure representing the structure of the MIME message. MIME headers are copied into the structure, but for MIME bodies only the start and end offsets are stored, which saves space.

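The parameters to mime_tree are message_text, the text of the message to parse, and the optional flag, which selects how the "root" message is interpreted: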

If flag is 1, the "root" message follows RFC822. This means mime_tree will unfold the header fields, scan for the registered MIME header fields and extract their attributes. Alternatively, the root can be a MIME message that needs no unfolding and has its attributes separated by semicolons.

If flag is 2, the "root" message follows RFC2045. This means mime_tree will scan for MIME attributes.
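
The unfolding mentioned for flag 1 can be illustrated outside Virtuoso. The following is a minimal Python sketch of RFC822 header unfolding, not Virtuoso's own implementation; the function name unfold_headers is hypothetical and chosen only for this illustration.

```python
import re


def unfold_headers(header_block):
    """Join folded continuation lines onto the header field they belong to."""
    # RFC 822 treats a line break followed by a space or tab as just that
    # space or tab, so only the line break itself is removed.
    return re.sub(r"\r?\n(?=[ \t])", "", header_block)


# The folded Content-Type field from the example message on this page.
folded = 'Content-Type: "multipart/mixed";\n\tboundary="--the boundary"\nTo: self@localhost'
print(unfold_headers(folded).split("\n"))
```

After unfolding, the boundary attribute sits on the same logical line as Content-Type, separated from the media type by a semicolon, which is the form the attribute scan works on.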

In either case, mime_tree will look for the Content-Type header field and will parse "message/rfc822" and "multipart/digest" MIME bodies as nested messages.
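
The rule above can be sketched as a small Python check. This is only a model of the stated rule, not Virtuoso code; the names NESTED_MESSAGE_TYPES and is_nested_message_type are hypothetical, and the case-insensitive comparison is an assumption based on MIME media types being case-insensitive.

```python
# Content types whose bodies the text above says are parsed as nested messages.
NESTED_MESSAGE_TYPES = {"message/rfc822", "multipart/digest"}


def is_nested_message_type(content_type_value):
    """Return True if a Content-Type value names one of the nested-message types."""
    # Drop any attributes after the first semicolon, then surrounding
    # whitespace and optional quotes, before comparing.
    media_type = content_type_value.split(";", 1)[0].strip().strip('"').lower()
    return media_type in NESTED_MESSAGE_TYPES


print(is_nested_message_type("message/rfc822"))
print(is_nested_message_type('image/gif; filename="the_big_picture.gif"'))
```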

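mime_tree will return an array of 3 elements (message descriptor). As the example below shows, its elements are:

an array of attributes, stored as alternating names and values;
a body descriptor of 4 elements: the start offset, the end offset, a nested message descriptor (or 0 if the body is not parsed as a message) and the trailer offsets (or 0 if there is no trailer);
an array of message descriptors for the sub-parts, or 0 if there are none.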

Examples

Consider the following message text:

From: Somebody <someuser@somehost>
Mime-Version: 1.0
Content-Type: "multipart/mixed";
	boundary="--the boundary"
To: self@localhost

This is a multipart MIME message
----the boundary
Content-Type: image/gif; filename="the_big_picture.gif"

GIF........
----the boundary
Content-Type: message/rfc822

From: Ford@Perfect
To: vogon
Mime-Version: 1.0
Content-Type: multipart/alternate; boundary="--sub-boundary"

This is some Message
----sub-boundary
Content-Type: text/plain

Hi
----sub-boundary
Content-Type: text/html

<P>Hi</P>
----sub-boundary--
Some garbage
----the boundary
Content-Type: text/plain

Some additional text
----the boundary--
Some additional garbage

MIME_TREE(the_text, 1) will produce:

--- the main message start
(
 ("From", "Somebody <someuser@somehost>",
      "Mime-Version", "1.0", "Content-Type",
      "multipart/mixed",
      "boundary", "--the boundary",
      "To", "self@localhost"), 		--- main attributes
 (n1, n2, 0, (mg1, mg2)), 		--- main message body
						--- ("This is a multipart MIME message")
 ( 					--- Sub-parts array start
  ( 					--- Sub-Part 1
   ("Content-Type", "image/gif",
        "filename",
        "the_big_picture.gif"), 	--- Attributes
   (s2, e2, 0, 0), 			--- body
   0 					--- no sub parts of the GIF
  ),
  (					--- Sub-Part 2
   ("Content-Type", "message/rfc822"),	--- Attributes
   (s3, e3, 				--- the body offsets
    (					--- the body is recognized as a message,
						--- so parse it
     ("From", "Ford@Perfect", "To", "vogon",
          "Mime-Version", "1.0",
          "Content-Type",
          "multipart/alternate",
          "boundary",
          "--sub-boundary"),		--- The body's Attributes
     (ss1, se1, 0, (g2, ge2)),		--- the body's body ("This is some Message").
						--- The nested message has the trailing text
						--- "Some garbage"
						--- marked by g2, ge2 offsets
     (					--- body's parts
      (					--- body's SubPart 1
       ("Content-Type", "text/plain"),	--- attributes
       (ss2, se2, 0, 0),		--- the text "Hi"
       0				--- no subparts
      ),
      ( 				--- body's SubPart 2
       ("Content-Type", "text/html"),	--- attributes
       (ss3, se3, 0, 0),		--- the HTML paragraph "Hi"
       0				--- no subparts
      )
     )
    ),					--- end of the body's structure
    0					--- no trailers
   ),
   0					--- no subparts
  ),
  (					--- SubPart 3
   ("Content-Type", "text/plain"),	--- attributes
   (s4, e4, 0, (g1, ge1)),		--- the text "Some additional text"
						--- and "Some additional garbage"
   0					--- no subparts
  )
 )					--- end of subparts array of the main message
)

Here n1, n2, mg1, mg2, s2, e2, s3, e3, ss1, se1, g2, ge2, ss2, se2, ss3, se3, s4, e4, g1, ge1 are offsets marking the start and end of the corresponding pieces within the source message. They can be passed to the subseq function:

subseq (the_text, g1, ge1) returns the string "Some additional garbage"
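
To show how the offsets and the nesting fit together, here is a small Python model of a message descriptor and a walker that extracts the leaf bodies. It does not call Virtuoso: the descriptor is built by hand for a tiny message, the offsets are computed with str.index purely for illustration and are not claimed to match what mime_tree would return, and the names attribute, leaf_bodies and span are hypothetical. Python slicing text[start:end] plays the role of subseq (the_text, start, end).

```python
# Hypothetical Python model of a mime_tree message descriptor:
#   (attributes, body, parts)
#   attributes: flat sequence of name, value, name, value, ...
#   body:       (start, end, nested_descriptor_or_0, trailer_offsets_or_0)
#   parts:      sequence of descriptors, or 0 when there are none


def attribute(descriptor, name):
    """Return the value stored after `name` in the flat attribute list, or None."""
    attrs = descriptor[0]
    for i in range(0, len(attrs) - 1, 2):
        if attrs[i].lower() == name.lower():
            return attrs[i + 1]
    return None


def leaf_bodies(text, descriptor):
    """Yield (content_type, body_text) for every part that has no sub-parts."""
    _attrs, body, parts = descriptor
    start, end, nested, _trailer = body
    if nested:            # the body was itself parsed as a message
        yield from leaf_bodies(text, nested)
    elif parts:           # multipart: descend into each sub-part
        for part in parts:
            yield from leaf_bodies(text, part)
    else:                 # leaf: slice the source text by its offsets
        yield attribute(descriptor, "Content-Type"), text[start:end]


text = (
    'Content-Type: multipart/mixed; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hi\n"
    "--b\n"
    "Content-Type: text/html\n"
    "\n"
    "<P>Hi</P>\n"
    "--b--\n"
)


def span(needle):
    """Return illustrative (start, end) offsets of the first match of needle."""
    start = text.index(needle)
    return start, start + len(needle)


s1, e1 = span("Hi")
s2, e2 = span("<P>Hi</P>")
tree = (
    ("Content-Type", "multipart/mixed", "boundary", "b"),
    (text.index("--b"), len(text), 0, 0),
    (
        (("Content-Type", "text/plain"), (s1, e1, 0, 0), 0),
        (("Content-Type", "text/html"), (s2, e2, 0, 0), 0),
    ),
)

print(list(leaf_bodies(text, tree)))
```

Because the descriptor holds only offsets for the bodies, the walker needs the original message text alongside the tree, just as subseq needs the_text in the call above.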