***********************************************************************
    IBM TEXT-TO-SPEECH TTS RUN TIME KIT 
    Version 6.4.0.3
    Readme (win32.readme.6.4.0.3.txt)
    Copyright IBM Corporation, 2002.  All Rights Reserved 
***********************************************************************


CONTENTS
--------
 1.  Company
 2.  Product
 3.  Version 
 4.  Description
 5.  Contact Information
 6.  Upgrade Path to Full Version
 7.  What's New
 8.  Installation Requirements
 9.  End-User Installation Instructions
10.  ISV Installation Instructions
11.  Working with Concatenative Voices
12.  Uninstall Instructions
13.  General Limitations and Comments 
14.  Known Problems & F.A.Q.
15.  Developer Notes
16.  Memory and Performance Tools
17.  Logging Utilities
18.  Trademark Information




1.  COMPANY
-----------
    International Business Machines Corporation (IBM)


2.  PRODUCT
-----------
    IBM Text-to-Speech Run Time Kit


3.  VERSION
-----------
    IBM Text-to-Speech TTS Run Time Kit, Version 6.4.0.3 


4.  DESCRIPTION 
----------------
The IBM Text-to-Speech Run Time Kit provides the speech synthesis engine 
and components that applications need to produce speech. Version 6.4.0.3 
produces speech from recordings of units of human speech. These units 
(possibly phonemes, syllables, words, or phrases) are combined 
(concatenated) according to linguistic rules formulated from the 
analyzed text. When the recorded speech units are entire phrases or 
sentences, the output can be very natural, human-sounding speech.

The components of the Text-to-Speech Run Time Kit include: 
- Speech synthesis engine
- Data sets (per language):
      Voice 1   Adult male     8 kHz 
      Voice 2   Adult female   8 kHz
      Voice 4   Adult male     8 kHz (U.S. English only)

The Speech synthesis engine and data include capability for a 
concatenative voice dataset representation as well as for synthesized 
voice representation. The concatenative voice is derived from a 
professional speaker, speaking a particular language and dialect, 
recorded at a particular sampling rate. When a client program changes 
languages, and it is doing concatenative synthesis, a new voice dataset 
may have to be loaded into memory from disk, if it is not already 
cached in memory from previous usage. 

The system automatically chooses concatenative synthesis if a voice 
data set is available for the language, voice, and sample rate that you 
select. For example, if you are using U.S. English at 8 kHz with voice 1, 
and U.S. English voice 1 at 8 kHz has been installed, the system 
automatically uses concatenative synthesis. Otherwise, the system uses 
formant synthesis. 
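
The selection rule above can be modeled in a few lines. This is only an 
illustrative sketch: the function name, the tuple layout, and the 
"installed" set are assumptions made for the example, and the engine's 
real lookup is internal to TTS.

```python
# Illustrative model of the rule described above; not the engine's code.
def choose_synthesis(installed_voices, language, voice, sample_rate_hz):
    """Return 'concatenative' when a voice data set matching the requested
    language, voice number, and sample rate is installed, else 'formant'."""
    key = (language, voice, sample_rate_hz)
    return "concatenative" if key in installed_voices else "formant"


# Hypothetical installation: only U.S. English voice 1 at 8 kHz.
installed = {("US English", 1, 8000)}

print(choose_synthesis(installed, "US English", 1, 8000))  # concatenative
print(choose_synthesis(installed, "US English", 2, 8000))  # formant
```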

When concatenation is being done, ECI voice selections appear to the 
concatenative engine as requests to switch between already-loaded voice 
datasets, while voice attribute settings appear as changes in the 
phonetic and acoustic data that it receives. 


5.  CONTACT INFORMATION
-----------------------
Please visit our Web site for enhancements and updates to Text-to-Speech.

    http://www.software.ibm.com/speech/dev


6.  UPGRADE PATH TO FULL VERSION
--------------------------------
The full version is currently included.


7.  WHAT'S NEW
--------------
This version of Text-to-Speech includes support for custom filters.  An 
e-mail filter is provided that converts e-mail messages into a more 
natural format.  Please refer to the Text-to-Speech SDK for more 
information on implementing and using custom filters.


8.  INSTALLATION REQUIREMENTS
-----------------------------
Hardware: 
Formant
- Processor performance equivalent to Intel Pentium 133MHz with MMX 
  with 256K L2 cache
- 48MB of RAM in total
- 10MB available hard disk space
- Compatible 16 bit sound card 
- CD-ROM drive 
Note: Formant functionality is supported under:
      Windows 98
      Windows 2000
      Windows NT 4.0
      Windows Millennium
      Windows XP


Concatenative
- Processor performance equivalent to Intel Pentium III 266MHz
- 48MB of RAM plus 150MB of RAM per Concatenative Voice loaded
- 10MB available hard disk space + 150 MB Per Concatenative Voice,
  except Chinese which requires 300 MB for each Concatenative Voice.
- Compatible 16 bit sound card 
- CD-ROM drive 
Note: Concatenative functionality is only supported under:
      Windows 2000 with Service Pack 1 
      Windows NT 4.0 with Service Pack 6
      Windows XP
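
As a worked example of the concatenative figures above, the following 
sketch adds up RAM and disk needs for a set of voices. The function and 
constant names are made up for illustration, and the RAM figure assumes 
that every listed voice is loaded at the same time.

```python
BASE_RAM_MB = 48             # engine RAM, from the list above
BASE_DISK_MB = 10            # engine disk space
VOICE_RAM_MB = 150           # RAM per loaded concatenative voice
VOICE_DISK_MB = 150          # disk per concatenative voice
CHINESE_VOICE_DISK_MB = 300  # disk per Chinese concatenative voice


def concatenative_requirements(voice_languages):
    """voice_languages: one language name per concatenative voice.
    Returns (ram_mb, disk_mb)."""
    ram = BASE_RAM_MB + VOICE_RAM_MB * len(voice_languages)
    disk = BASE_DISK_MB + sum(
        CHINESE_VOICE_DISK_MB if "Chinese" in lang else VOICE_DISK_MB
        for lang in voice_languages
    )
    return ram, disk


# One U.S. English voice plus one Chinese voice.
print(concatenative_requirements(["US English", "Chinese Simplified"]))
# (348, 460)
```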


9.  END-USER INSTALLATION INSTRUCTIONS
--------------------------------------
Run setup.exe from the installation media.
Follow the instructions presented to you.
You may be prompted to install concatenative voices.
Select the voices to be used with concatenative voice synthesis.


10.  ISV INSTALLATION INSTRUCTIONS
----------------------------------
If you are deploying applications that use the IBM Text-to-Speech Run 
Time Kit, you must obtain a license from IBM for redistribution.
In addition, you will want to integrate our product installation with 
your product's installation program. To do so, copy the redistributable 
TTS driver to your installation media and invoke setup.exe.
The IBM Text-to-Speech Run Time Kit installation program, setup.exe, 
takes the following command line arguments: 

setup.exe [installPath] [/silent] [/hideaddremove] [/nr] [/ns] 
[/nl] [/nk] [-SMS] [/statusnone] [/statusold] [/concatall] 
[/concatnone] -lXXXX 


-l (lowercase L) must be followed by one of the following XXXX language 
codes:

0003-Catalan   0005-Czech      0006-Danish    0007-German
0008-Greek     0009-English    000a-Spanish   000b-Finnish
000e-Hungarian 0010-Italian    0011-Japanese  0012-Korean
0013-Dutch     0014-Norwegian  0015-Polish    0019-Russian
001a-Croatian  001b-Slovak     001d-Swedish   001e-Thai
001f-Turkish   0021-Indonesian 0024-Slovenian 002d-Basque
0404-Chinese (Taiwan)        040c-French (Standard)
0416-Portuguese (Brazilian)  0804-Chinese (PRC)
0816-Portuguese (Standard)   0c0c-French (Canadian)

**Note: Due to an InstallShield limitation, if you are using DoInstall 
you must specify the same language as the parent installation.  See IS 
document Q144122.

[installPath] is a fully qualified path and can contain spaces.  Do not 
place quotes around the path.  The path is ignored if TTS is already 
on the system.  If a path is provided on the command line, the choose 
directory dialog is not shown.

/silent
Prevents every dialog except the path dialog from appearing.  If voice 
data is detected, the installer still asks which voices to install, 
regardless of this parameter.

/hideaddremove
Deletes the Add/Remove program entry from the control panel.

/nr 
Suppresses the reboot message and the subsequent reboot.  If a calling 
application runs our install with a GUI, the calling install may perform 
additional logic and should then reboot if TTS requests it.  Please see 
appendix 2 for how to determine whether TTS requires a reboot.  TTS 
will not work until the requested reboot is carried out.  If the 
/silent option is used, /nr is redundant.

-SMS 
Prevents the network connection and Setup.exe from closing before the 
installation is complete.  The switch works with installations 
originating from a Windows NT server over a network. 
This switch is case-sensitive: SMS must be uppercase.

/statusold
By default, the TTS install will show a large progress bar dialog box.
To display the small dialog box, use the /statusold option.

/statusnone
To turn off the status box altogether, use the option /statusnone.

/concatall
Installs all concatenative voices.  Check the return code for an 
out-of-space condition.

/concatnone
Installs none of the concatenative voices.


[Redundant but still supported for backwards compatibility]
/nk do not hide add remove (now default behavior)
/nl no license (no license now packaged).
/ns (silent install)

*Please note the language parameter is not optional. A minimal amount 
of change is required to make old installations work.
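
An ISV install script might assemble the command line from the options 
above roughly as follows. This is a sketch only: the function name, its 
parameters, and the install path are hypothetical, and just a few of the 
language codes are included. The string is built but not executed, and 
the path is deliberately left unquoted, as described above.

```python
# Subset of the language codes listed in this section.
LANGUAGE_CODES = {
    "0007": "German",
    "0009": "English",
    "040c": "French (Standard)",
}


def build_setup_command(lang_code, install_path=None, silent=False,
                        no_reboot=False, concat=None):
    """Return a setup.exe command line as a single string.

    concat may be "all", "none", or None (let the installer ask).
    """
    if lang_code not in LANGUAGE_CODES:
        raise ValueError("unknown language code: %s" % lang_code)
    parts = ["setup.exe"]
    if install_path:
        parts.append(install_path)  # no quotes, even if it contains spaces
    if silent:
        parts.append("/silent")
    if no_reboot:
        parts.append("/nr")
    if concat == "all":
        parts.append("/concatall")
    elif concat == "none":
        parts.append("/concatnone")
    parts.append("-l" + lang_code)  # the language parameter is mandatory
    return " ".join(parts)


# Hypothetical install path, chosen only for the example.
print(build_setup_command("0009", r"C:\Program Files\IBM TTS",
                          silent=True, concat="none"))
# setup.exe C:\Program Files\IBM TTS /silent /concatnone -l0009
```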


11.  WORKING WITH CONCATENATIVE VOICES
--------------------------------------
During installation you may install concatenative voices from the 
selection presented to you.  Due to disk space issues or for periodic 
updates, you may wish to add, remove, or relocate a concatenative voice.
To add a voice, rerun the installation and select the voice you wish to 
add.  To remove a voice, you must unregister the voice and then manually 
delete it from the 
<INSTALLATION DIRECTORY>\voices\<LANGUAGE>\<VOICENUMBER> 
directory. 
To relocate a voice, or to update a voice from a downloaded file, you 
must register the location of the voice using the inivoice.exe utility. 

inivoice.exe [-u] <VOICENUMBER> <QUALIFIED PATH TO SYNTHINFO FILE>

For example, to move voice 1 from the default TTS installation path 
to F:\TTSVoices\us\1, move the data files and then invoke the 
following command:

C:>inivoice.exe 1 "F:\TTSVoices\us\1at8000KHz_1_0\synthinfo"

To unregister a voice with the system, use the -u option.

C:>inivoice.exe -u 1 "F:\TTSVoices\us\1at8000KHz_1_0\synthinfo"


Note: Concatenative voices allow the following parameters to be 
adjusted at run time:
   - Volume
   - Pitch Baseline*
   - Speed
   - Pitch Fluctuation*
* Applies only to some voices

The following parameters are not changeable for concatenative voices:
   - Gender
   - Sample Rate (see section 4 above)
   - Head Size
   - Roughness
   - Breathiness

If an application attempts to change one of the parameters listed above 
as not changeable, no error occurs and the synthesized voice does not 
change.

In concatenative TTS, when you change languages, the voice
characteristics are set to the default values for the currently 
active voice.  As a result, if you've modified the speed or volume,
and do a language change, the speed and volume will revert to the
default for the voice.  
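
An application that wants to keep a custom speed or volume across a 
language change can save the values first and reapply them afterwards. 
The sketch below shows only that pattern. FakeEngine is a made-up 
stand-in, not the ECI API, and its default values are invented for the 
example.

```python
class FakeEngine:
    """Stand-in for an engine handle; NOT the real ECI interface."""

    DEFAULTS = {"speed": 50, "volume": 90}  # invented defaults

    def __init__(self):
        self.language = "US English"
        self.params = dict(self.DEFAULTS)

    def set_language(self, language):
        # Mimics the behavior described above: a language change
        # resets the voice characteristics to their defaults.
        self.language = language
        self.params = dict(self.DEFAULTS)


def change_language_keeping(engine, language, keep=("speed", "volume")):
    """Change language, then reapply the previously set parameters."""
    saved = {name: engine.params[name] for name in keep}
    engine.set_language(language)
    engine.params.update(saved)


engine = FakeEngine()
engine.params["speed"] = 70
change_language_keeping(engine, "Standard German")
print(engine.language, engine.params["speed"])  # Standard German 70
```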


12.  UNINSTALL INSTRUCTIONS
---------------------------
To uninstall the Text-to-Speech Run Time Kit: 

  Open Control Panel 
  Select Add/Remove Programs
  Select the entry for IBM Text-to-Speech Runtime (for the appropriate 
  language)

You will be guided through the uninstall process.  


13.  GENERAL LIMITATIONS AND COMMENTS
-------------------------------------
This section contains information that is not specific to any 
particular element of the Text-to-Speech Run Time Kit but is general or 
generic in nature. It is very important to heed these warnings and 
follow the instructions given to avoid abnormal or unpredictable 
results.

*  Currently, only 8 KHz concatenative voices are provided. 
   Application programmers requiring higher quality audio should 
   upgrade their voice datasets.  For more information visit the IBM 
   Text-to-Speech home page.

*  Currently, Version 6.4.0.3 supports the following languages with 
   formant voices (Note: languages with a * denote formant and 
   concatenative voice support):
   
   Brazilian Portuguese*
   French*
   Canadian French* 
   Finnish
   German*
   United States English*
   United Kingdom English*
   Spanish*
   Mexican Spanish
   Italian*
   Chinese Simplified*
   Chinese Traditional*
   Japanese*
   

*  Currently, the included e-mail filter is only available for the 
   English language.

*  The e-mail filter included with IBM Text-to-Speech recognizes the 
   following keywords in an e-mail message:

Keyword                      Action
-------                      ------
Subject:                     Parse out the subject of the message 
                             and return a new subject string to the 
                             client application.
To:                          Filter out lines until a recognized 
                             keyword is encountered.
From:                        Parse out the sender of the message 
                             and return a new string to the client 
                             application.
Date:                        Parse out the date that the message 
                             was sent and return a new string with 
                             that date to the client application.
Sent:                        Parse out the date that the message 
                             was sent and return a new string with 
                             that date to the client application.
Alternate-Recipient:         Filter out the current line.
Mime-Version:                Filter out the current line.
Return-Path:                 Filter out the current line.
MR-Received:                 Filter out the current line.
Content-Type:                Filter out lines until a recognized 
                             keyword is encountered.
Content-Transfer-Encoding:   Filter out the current line.
Posting-Date:                Filter out the current line.
Importance:                  Filter out the current line.
Priority:                    Filter out the current line.
Sensitivity:                 Filter out the current line.
UA-Content-ID:               Filter out the current line.
X400-MTS-Identifier:         Filter out the current line.
A1-Type:                     Filter out the current line.
Hop-Count:                   Filter out the current line.
Content-Disposition:         Filter out the current line.
Delivered-To:                Filter out the current line.
X-Originating-IP:            Filter out the current line.
X-OriginalArrivalTime:       Filter out the current line.
Full-Name:                   Filter out the current line.
X-Mailer:                    Filter out the current line.
CC:                          Filter out the current line.
Filetime=                    Filter out lines until a recognized 
                             keyword is encountered.
X-Apparently-To:             Filter out the current line.
Content-Length:              Filter out the current line.
Auto-Submitted:              Filter out the current line.
Status:                      Filter out the current line.
Received:                    Filter out lines until a recognized
                             keyword is encountered.
   
*  The included e-mail filter will also filter the following "emoticons"
   from messages:

(R)   (C)   :-)   :-(   :-]   :)    ;)    :-#|  :(   :->   :-<   :-\\  
(-:   >:-<  :-|   :-o   :-c   |-)   |-O   :-#   :-%   :-&  :-'|  :-)'  
:-)8  :-* :-/   :-:   :-?   :-@   (:I   :-[   *:o)  +-(:-).-)  <:I   
@:I   [:-|] 8-#  8:-)  }(:-( :-{   :-{(  :-}   :-O   :-6   :-8(  :-9  
:-D   :-e   :-i   :-p :-t   :-v   ::-)  8-)   :<|   :=)   :>)   :~)   
;-)  %-)   (-)   (:-)  )8-) *-(   *<|:-)-:-)  ;-\\  =:-)  [:-)  O-)   
8-|   {(:-){:-)  <g>   <G>   

*  The eciUpdateFilter function for the included e-mail filter only 
   supports changing the behavior for the "From:", "Date:", and 
   "Subject:" fields. 
   
*  The Text-to-Speech SDK includes a file, "maildict.dct", that contains 
   translations for common e-mail jargon and abbreviations.  For best 
   results when processing e-mail messages, use this dictionary file 
   in conjunction with the included e-mail filter.
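
The three kinds of action in the keyword table above (parse a field, 
drop the current line, drop lines until the next recognized keyword) 
can be illustrated with a simplified model. This is not the shipped 
filter: only a few keywords are included, matching is assumed to be 
case-sensitive and anchored at the start of a line, and "parsing" is 
reduced to returning the text after the keyword.

```python
PARSE = ("Subject:", "From:", "Date:", "Sent:")
DROP_LINE = ("Mime-Version:", "Return-Path:", "X-Mailer:", "CC:")
DROP_UNTIL_KEYWORD = ("To:", "Content-Type:", "Received:")


def classify(line):
    """Return (action, keyword) for a recognized line, else (None, None)."""
    groups = (("parse", PARSE), ("drop", DROP_LINE),
              ("skip", DROP_UNTIL_KEYWORD))
    for action, keywords in groups:
        for keyword in keywords:
            if line.startswith(keyword):
                return action, keyword
    return None, None


def filter_email(lines):
    """Apply the simplified keyword actions to a list of message lines."""
    kept = []
    skipping = False  # True while dropping lines after a "skip" keyword
    for line in lines:
        action, keyword = classify(line)
        if action is None:
            if not skipping:
                kept.append(line)
            continue
        skipping = (action == "skip")
        if action == "parse":
            kept.append(line[len(keyword):].strip())
    return kept


message = [
    "Received: from mail.example.com",
    "    by relay.example.com",
    "From: Pat",
    "To: Lee",
    "Mime-Version: 1.0",
    "Subject: Lunch",
    "See you at noon.",
]
print(filter_email(message))  # ['Pat', 'Lunch', 'See you at noon.']
```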
   

=========
inifilter

The inifilter tool registers and unregisters filters, which are used
as preprocessor add-ins for ECI to modify text.

inifilter [-ul] /filter:[filterNum] /path:[filterPath] /autoload:[y/n]
 /lang:[lang] /ECIINI:[IniPath]

        -u              Disable specified filter
        -l              Display statistics about specified filter
        filter          Filter number
        path            Fully qualified filename of filter
        autoload        Filter is automatically loaded when language
                        selected
                        Valid values are:
                                n   Filter is not automatically loaded
                                y   Filter is automatically loaded
        lang            Language/Dialect for the filter
                        Valid language/dialect values are:
                                 1.0 - US English
                                 1.1 - British English
                                 2.0 - Castilian Spanish
                                 2.1 - Mexican Spanish
                                 3.0 - Standard French
                                 3.1 - Canadian French
                                 4.0 - Standard German
                                 5.0 - Standard Italian
                                 6.0 - Mandarin Chinese
                                 6.1 - Taiwanese Chinese
                                 7.0 - Brazilian Portuguese
                                 8.0 - Standard Japanese
                                 9.0 - Standard Finnish
                                13.0 - Standard Norwegian
                                14.0 - Standard Swedish
                                15.0 - Standard Danish
        ECIINI          Path to ECIINI file (not used on Windows
                        platforms).  On other platforms, the ECIINI 
                        environment variable is used if omitted.

NOTE: If -u is specified, only the language, filter and INI file may be 
      specified.
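
The following sketch shows how the inifilter options above fit together 
for registering and unregistering a filter. The helper names, the 
filter number, and the filter path are hypothetical. The path is left 
unquoted because this section does not state a quoting rule for /path, 
so the example uses a path without spaces.

```python
# Subset of the language/dialect values listed above.
DIALECTS = {
    "US English": "1.0",
    "British English": "1.1",
    "Standard German": "4.0",
    "Standard Japanese": "8.0",
}


def inifilter_register(filter_num, filter_path, dialect, autoload=True):
    """Build the command line that registers a filter for one dialect."""
    return "inifilter /filter:%d /path:%s /autoload:%s /lang:%s" % (
        filter_num, filter_path, "y" if autoload else "n", DIALECTS[dialect])


def inifilter_unregister(filter_num, dialect):
    """Build the -u command line; only filter and language are passed."""
    return "inifilter -u /filter:%d /lang:%s" % (filter_num, DIALECTS[dialect])


# Hypothetical filter location.
print(inifilter_register(1, r"C:\filters\mailfilter.dll", "US English"))
# inifilter /filter:1 /path:C:\filters\mailfilter.dll /autoload:y /lang:1.0
print(inifilter_unregister(1, "US English"))
# inifilter -u /filter:1 /lang:1.0
```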



14.  KNOWN PROBLEMS & F.A.Q.
----------------------------
The following are known problems in this release:

*  If you are upgrading from TTS version 4.7 to TTS Version 6.4.0.3, 
   you will need to remove TTS version 4.7 prior to installing TTS 
   Version 6.4.0.3. 

*  On Windows XP and Windows 2000, non-administrator users may receive 
   error messages about the InstallShield engine not being able to 
   register.  You need the proper access permissions to install.

*  On Windows XP and Windows 2000, you must have the proper access 
   permissions to run the command line tools (inicache, inifilter,
   inivoice, and initrace).  If you do not have the proper access 
   permissions, no error message is shown and your changes are not
   made. 

*  Setting the pitch baseline after setting head size may return an 
   error in certain situations.

*  The installation copies a large amount of data from the installation 
   media. During the copy process, very little screen activity is 
   visible. 

*  If multiple versions of TTS are to be installed on the same system, 
   you should install all versions of TTS to the same directory.


F.A.Q
-----
Q: Why is my application still synthesizing with formant synthesis?

A: When you install an 8 kHz voice, the system produces concatenative 
   synthesis for any application that requests synthesis at 8 kHz.  By
   default, the system generates audio at 11 kHz.  To produce 
   concatenative speech, use eciSetParam to set the sample rate to 
   8 kHz.  Also, check that version 5.0 was not installed after version 
   6.4.0.3 if both versions reside on the same machine.


15.  DEVELOPER NOTES
--------------------
*  The Text-to-Speech SDK is a good starting point for developing 
   applications.

*  Using SAPI programs with concatenative synthesis
   If you have an 8 kHz concatenative voice installed, and you select a 
   SAPI voice that has been optimized for the telephone ("tel" in the 
   name and speaker fields, and 0x200 in the available feature field), 
   you will experience a delay while the concatenative voice data is 
   loaded into memory. This delay is considerably shortened the second 
   and subsequent times that you access the same voice, as the IBM 
   Concatenative Memory Manager (CMM) caches voices for a period of 
   time before flushing them from memory.

*  Concatenative Memory Manager (CMM) cmmcmd Utility
   A support utility called cmmcmd was created to interface with the
   Concatenative Memory Manager (CMM).
   Note: This is a support tool and is not intended to be an end-user 
   utility.

   Invoke cmmcmd as follows:

   cmmcmd shutdown       -- shuts down the CMM 
   cmmcmd timeout ##     -- sets the CMM timeout to ## seconds


16.  MEMORY AND PERFORMANCE TOOLS
----------------------------------
Due to the computational complexity and amount of memory required to
produce concatenative speech, IBM Text-to-Speech utilizes shared memory 
and speech caching to reduce the amount of system resources required.

*  The concatenative TTS engine requires more physical memory (to store
   the data required to produce natural speech synthesis) than formant 
   synthesis.  Since many processes on a server may require access to 
   the same data, IBM Text-to-Speech loads one instance and shares it 
   between all the processes.  In addition, IBM Text-to-Speech lets you 
   configure how long voice data remains loaded after the last 
   access.  By default, each concatenative voice remains loaded for 10 
   minutes.  To configure or stop the memory sharing, the 
   Concatenative Memory Manager (CMM) utility, cmmcmd.exe, is provided:

   cmmcmd { shutdown | timeout [secs] }

      shutdown         - shut down the server immediately.

      timeout [secs]   - get/set the server time-out to the specified 
                         number of seconds.  If secs is 0 or omitted 
                         the current shut down time-out is returned.

*  The concatenative TTS engine requires more computational power than
   the formant TTS engine.  Since the domains of many TTS applications are 
   limited to a small vocabulary, IBM Text-to-Speech now provides a 
   mechanism (speech caching) to bypass complex computations for text 
   which has already been processed.  The concatenative system can be
   configured, per language, to set a number of phrases 'to remember'
   as pre-synthesized phrases.  In addition, the memory can be made 
   persistent (that is, saved on exit and reloaded at voice 
   initialization).  By default no caching is performed.  To enable 
   and configure speech caching, the utility inicache.exe is provided:

inicache [-ul] [-p][-n] lang [phrases] [INI]

        -u      Disable voice caching
        -l      Display current voice cache values
        -p      Cache file is persistent (saved for future use)
        -n      Cache file is not persistent
        lang    Language/Dialect for the voice cache
                Valid language/dialect values are:
                         1.0  - US English
                         1.1  - British English
                         2.0  - Castilian Spanish
                         2.1  - Mexican Spanish
                         3.0  - Standard French
                         3.1  - Canadian French
                         4.0  - Standard German
                         5.0  - Standard Italian
                         6.0  - Simplified Chinese
                         6.0d - Simplified Chinese (dual language)
                         6.1  - Traditional Chinese
                         6.1d - Traditional Chinese (dual language)
                         7.0  - Brazilian Portuguese
                         8.0  - Standard Japanese
                         9.0  - Standard Finnish
                        13.0  - Standard Norwegian
                        14.0  - Standard Swedish
                        15.0  - Standard Danish
        phrases Maximum number of phrases in the voice cache
        INI     Path to ECIINI file (not used on Windows platforms).
                On other platforms, the ECIINI environment variable is 
                used if omitted.

NOTE: If -u is specified, only the language and INI file may be specified.
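
To illustrate the idea behind speech caching, the sketch below keeps a 
bounded number of already synthesized phrases and skips synthesis on a 
repeat. It is a conceptual model only: the class is hypothetical, the 
least-recently-used eviction policy is an assumption (the policy the 
engine uses is not documented here), and the "synthesizer" is a 
placeholder function.

```python
from collections import OrderedDict


class PhraseCache:
    """Bounded cache of synthesized phrases (conceptual model only)."""

    def __init__(self, max_phrases, synthesize):
        self.max_phrases = max_phrases  # like the 'phrases' argument
        self.synthesize = synthesize    # expensive synthesis function
        self.entries = OrderedDict()
        self.misses = 0

    def speak(self, text):
        if text in self.entries:
            self.entries.move_to_end(text)  # mark as recently used
            return self.entries[text]
        self.misses += 1
        audio = self.synthesize(text)
        self.entries[text] = audio
        if len(self.entries) > self.max_phrases:
            self.entries.popitem(last=False)  # evict least recently used
        return audio


# Placeholder synthesizer: upper-casing stands in for audio generation.
cache = PhraseCache(2, synthesize=lambda text: text.upper())
for phrase in ["press one", "press two", "press one",
               "press three", "press two"]:
    cache.speak(phrase)
print(cache.misses)  # 4
```

With a limit of two phrases, the third request is served from the cache, 
while "press two" has been evicted by the time it is requested again.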
 

17.  LOGGING UTILITIES
----------------------
Logs must often be produced for auditing, technical support, and 
diagnostic purposes.  The logging provided by IBM Text-to-Speech
is extremely verbose and is intended primarily for technical support 
and diagnostic purposes.  To enable and configure logging, the utility 
initrace.exe is provided:

   initrace level [file] [INI]

       level   Tracing level [0 = off, 1 = on]

       file    Name of trace file
               Do not specify trace file if level is 0

       INI     Path to ECIINI file (not used on Windows platforms)
               ECIINI environment variable used on other platforms if 
               omitted

NOTE:  Paths that include spaces must be enclosed in double quotes.
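
A script that turns tracing on and off might build the initrace command 
line as sketched below, applying the quoting rule from the note above. 
The helper names and the trace file path are hypothetical, and the 
optional INI argument is left out because it is not used on Windows.

```python
def quote_if_needed(path):
    """Wrap a path in double quotes when it contains spaces."""
    return '"%s"' % path if " " in path else path


def initrace_command(level, trace_file=None):
    """Build an initrace command line for level 0 (off) or 1 (on)."""
    if level not in (0, 1):
        raise ValueError("level must be 0 or 1")
    if level == 0:
        if trace_file is not None:
            raise ValueError("do not specify a trace file when level is 0")
        return "initrace 0"
    parts = ["initrace", "1"]
    if trace_file:
        parts.append(quote_if_needed(trace_file))
    return " ".join(parts)


# Hypothetical trace file location containing a space.
print(initrace_command(1, r"C:\TTS Logs\trace.log"))
# initrace 1 "C:\TTS Logs\trace.log"
print(initrace_command(0))
# initrace 0
```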
   

18.  TRADEMARK INFORMATION
--------------------------
IBM is a registered trademark or trademark of International Business
Machines Corporation in the United States and other countries.

Microsoft, Windows, Windows NT, Windows 95, Windows 98, Windows XP, 
and Windows 2000 logo are trademarks or registered trademarks of 
Microsoft Corporation in the United States and/or other countries.

All other names are registered trademarks, trademarks or service marks 
of their respective companies.


Doc Number: win32.readme.6.4.0.3.txt.050302