:KDW0DNHVD1RQQDWLYH$FFHQW"D6WXG\RI.RUHDQ(QJOLVK -RQJPL.LPDQG6X]DQQH)O\QQ 'HSDUWPHQWRI(QJOLVK/DQJXDJHDQG/LWHUDWXUH .DQJZRQ1DWLRQDO8QLYHUVLW\5HSXEOLFRI.RUHD NLPMP#NDQJZRQDFNU 'HSDUWPHQWRI/LQJXLVWLFVDQG3KLORVRSK\ 0DVVDFKXVHWWV,QVWLWXWHRI7HFKQRORJ\ VIO\QQ#PLWHGX $EVWUDFW :H UHSRUW D VHW RI UHVXOWV LQ WKH VHFRQG ODQJXDJH / DFTXLVLWLRQ RI (QJOLVK SKRQRORJ\ E\ ILUVW ODQJXDJH / VSHDNHUV RI .RUHDQ 6SHFLILFDOO\ ZH IRFXV RQ VLJQLILFDQW GLIIHUHQFHV LVRODWHG EHWZHHQ / VSHDNHUV¶ SURGXFWLRQ RI LVRODWHG ZRUGV LQ (QJOLVK DQG WKHLU SURGXFWLRQ RI WKH VDPH ZRUGV LQ VHQWHQFH DQG SKUDVDO FRQWH[WV 5HVXOWV LQGLFDWH VLJQLILFDQWO\ PRUH DFFXUDWH SURGXFWLRQ RI ZRUGV LQ LVRODWLRQ WKDQLQWKHSURGXFWLRQRIWKHVDPHZRUGVLQSKUDVDOFRQWH[WV 7KH SDUWLFXODU SKRQRORJLFDO SKHQRPHQD IRFXVHG RQ FRQFHUQ ERWKVWUHVVUHGXFWLRQDQGSODFHPHQW:HDOVRFRQVLGHUVHYHUDO RWKHU DVSHFWV RI VHJPHQWDO SKRQRORJ\ :H DUJXH WKDW WKH GLVFUHSDQF\ LQ UHVXOWV REVHUYHG EHWZHHQ WDVNV PD\ DFFRXQW IRUPDQ\RIWKHVHHPLQJO\GLVSDUDWHUHVXOWVLQGLFDWHGLQRWKHU VWXGLHV RI / SKRQRORJ\ :H GLVFXVV VHYHUDO SRVVLEOH H[SODQDWLRQVIRUWKHVHGDWDLQWHUPVRIZKLFKSURGXFWLRQWDVN PRVWFORVHO\SURYLGHVDPHDVXUHPHQWRIGHYHORSLQJOLQJXLVWLF FRPSHWHQFHDQGZKLFKPLJKWUHIOHFWWKHUROHRIHLWKHUJHQHUDO OHDUQLQJ VWUDWHJLHV RYHUJHQHUDOL]DWLRQ RU UHYHUVLRQ EDFN WR WKH / JUDPPDU XQGHU FRQGLWLRQV RI VWUHVV RU ZKHQ WKH / JUDPPDULVQRWIXOO\GHYHORSHG  ,QWURGXFWLRQ :KDW PDNHV D QRQQDWLYH DFFHQW" 6HYHUDO IDFWRUV PD\ FRQWULEXWHWRWKLV$QDFFHQWPD\LQYROYHVHJPHQWDOLQVHUWLRQ GHOHWLRQ DQGRU VXEVWLWXWLRQ 2Q WKH RWKHU KDQG DQ DFFHQW PD\EHGXHWRGLIIHUHQFHVEHWZHHQWKH/DQGWKH/SURVRGLF SDWWHUQV ZKLFK PD\ LQ WXUQ UHIOHFW GLIIHUHQFHV KDYLQJ WR GR ZLWKGXUDWLRQDPSOLWXGHDQGRUSLWFK$PRQJWKHVHGLIIHUHQW 
SRVVLELOLWLHVPRVWUHVHDUFKKDV IRFXVHGRQLVVXHVFRQFHUQLQJ VHJPHQWDO IDFWRUV )HZ VWXGLHV KDYH LQYHVWLJDWHG WKH / DFTXLVLWLRQ RI VWUHVV DQG SURVRG\ KRZHYHU VHH %URVHORZ $UFKLEDOG DQG &DUVRQ > @ DQG IHZHU LQYROYH DQ\ DFRXVWLF DQDO\VHV $V D FRQVHTXHQFH ZH DUH OHIW ZLWK D YHU\ LQFRPSOHWH XQGHUVWDQGLQJ RI ZKDW DFFRXQWV IRU D QRQQDWLYH DFFHQW ,QWKLVSDSHUZHFRQWULEXWHWRWKHVWXG\RIDFFHQWWKURXJK TXDQWLWDWLYHDFRXVWLFDQDO\VHVRIDODUJHGDWDEDVHFRQVLVWLQJRI UHFRUGHGVSHHFKVDPSOHVIURP/VSHDNHUVRI.RUHDQOHDUQLQJ (QJOLVK DV DQ / 2XU DQDO\VHV IRFXV RQSURVRGLF DVSHFWVRI WKH / VSHHFK VDPSOHV LQ D V\VWHPDWLF DQG FRQWUROOHG ZD\ :H EHOLHYH WKDW WKH FRQFOXVLRQV VXJJHVWHG E\ WKHVH GDWD FRQWULEXWH WR RXU XQGHUVWDQGLQJ RI WKH GHYHORSPHQW RI / SKRQRORJ\ $W RWKHU OHYHOV WKH UHVXOWV RI WKLV FXUUHQW VWXG\ FDQ XQLTXHO\ LQIRUP VSHHFK HQJLQHHULQJ DSSOLFDWLRQV DV ZHOO DV RWKHU VXFK HQGHDYRUV 7KH GDWD UHSRUWHG KHUH UHSUHVHQW DQ H[SDQVLRQ RI D UHVHDUFK SURMHFW WKDW LV FXUUHQWO\ EHLQJ LPSOHPHQWHGLQDQRQQDWLYHVSHHFKUHFRJQLWLRQV\VWHP>@  7KH/VWXG\ 7KH GDWD ZDV FROOHFWHG IURP  QDWLYH VSHDNHUV RI .RUHDQ OHDUQLQJ (QJOLVK DV DQ / DV ZHOO DV VL[ QDWLYH VSHDNHUV RI (QJOLVK 7KH VXEMHFWV ZHUH UHFRUGHG ZKLOH UHDGLQJ DORXG SKRQRORJLFDOO\ FRQWUROOHG (QJOLVK WH[WV $OO WKH / OHDUQHUV VSRNH WKH VWDQGDUG GLDOHFW RI .RUHD 7KH / OHYHO RI FRPSHWHQFH LQ (QJOLVK YDULHG DOWKRXJK PRVW RI WKH VXEMHFWV ZHUH DW WKH LQWHUPHGLDWH OHYHO DV GHWHUPLQHG E\ WKHLU DFDGHPLF VWDQGLQJ LQ D SURQXQFLDWLRQ FODVV 7KH IROORZLQJ GHVFULEHV WKH PHWKRGRORJ\ FKRVHQ IRU WKH PDMRU SDUW RI WKLV VWXG\WKDWLQYROYHVVXEMHFWV$VPDOOVXEVHWRIWKHGDWDZDV FROOHFWHGE\YDULRXVRWKHUPHWKRGVIRUDEDODQFHGVWXG\ 7KH PHWKRGRORJ\ RI WKLV VWXG\ FRQVLVWHG RI IRXU 
SDUWV )LUVW D QDWLYH VSHDNHU RI (QJOLVK ZDV FKRVHQ DV WKH PRGHO VSHDNHU 7KLV QDWLYH VSHDNHU ZDV  \HDUV ROG PDOH DQG D QDWLYH RI WKH VWDWH RI 8WDK 86$ +LV GLDOHFW ZDV VWDQGDUG $PHULFDQPLGZHVWHUQ 7KHVHFRQGSDUWRIWKLVWDVNLQYROYHGWKHUHFRUGLQJRIDOO WKHVWLPXOLE\WKHQDWLYHVSHDNHUQV7RGRWKLVWKHQVUHDG WKHH[SHULPHQWDOVWLPXOXVLWHPVLQDTXLWHURRP7KHUHDGLQJV ZHUHUHFRUGHGDQGGLJLWL]HGDWDN+]VDPSOLQJUDWHXVLQJ &RPSXWHUL]HG6SHHFK/DEE\.D\(OHPHWULFV,QF 7KH WKLUG SDUW RI WKH WDVN SURWRFRO LQYROYHG GLVWULEXWLQJ WKH PRGHO VSHHFK ILOHV WR DOO WKH SDUWLFLSDQWV LQ WKH VWXG\ LQ ERWK &' IRUP DQG DV GRZQORDGDEOH ILOHV IURP DQ LQWHUQHW ZHEVLWH 7KH SDUWLFLSDQWV ZHUH WROG WR OLVWHQ WR WKH PDWHULDOV DQG WR SUDFWLFH UHDGLQJ WKHP IRU DQ DYHUDJH RI D ZHHN 7KH OHDUQHUVZHUHDVNHGWRLPLWDWHWKHPRGHOVSHHFKDVFORVHO\DV SRVVLEOH ,Q DGGLWLRQ SULRU WR UHFRUGLQJ WKH OHDUQHUV ZHUH JLYHQ H[SOLFLW SKRQHWLF LQVWUXFWLRQ WKDW IRFXVHG RQ SRWHQWLDO SURQXQFLDWLRQGLIILFXOWLHVZLWKWKHVWLPXOL ,Q SDUW IRXU WKH  QDWLYH VSHDNHUV RI .RUHDQ ZHUH UHFRUGHG 7KH OHDUQHUV UHDG HDFK VWLPXOXV LWHP LQ D TXLHW URRP 7KH UHFRUGLQJV ZHUH DJDLQ GLJLWL]HG LQ  N+] VDPSOLQJ UDWH E\ &RPSXWHUL]HG 6SHHFK /DE E\ .D\ (OHPHWULFV([DPSOHVRIWKHVWLPXOXVLWHPVDUHLQ7DEOH 7DEOH([DPSOHVRI6WLPXOL1RWH&RQV &RQVRQDQWV,QWR,QWRQDWLRQ 3URSHUW\ ([DPSOH 9RZHO E>HD@WE>L@WE>DL@WE>H@WE>D@WE>R@WWE>RX@JKW &RQV >S@HHS>W@HDW>N@H\>E@HHS>G@HHG>J@HH 6\OODEOH >U@LFH>O@LFHED>WK@ED>WKH@>VS@DLQ>VSU@DLQ 6WUHVV >D@GG>D@GGLWLRQ:+>,@7(K>RX@VHZK>L@WH+>28@6( 5K\WKP >'HOLYHU@>ERRNV@>)ULGD\@>'HOLYHU@>ERRNVE\@ >)ULGD\@>'HOLYHUWKH@>ERRNVE\@>)ULGD\@ ,QWR 7HDFKLQJODQJXDJHVLVKDUGHUWKDQOHDUQLQJWKHP INTERSPEECH 2004 - ICSLP 8 th International Conference 
on Spoken Language Processing ICC Jeju, Jeju Island, Korea October 4-8, 2004 ISCA Archive http://www.isca-speech.org/archive 10.21437/Interspeech.2004-600
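
The introduction names duration, amplitude, and pitch as the acoustic dimensions along which L1 and L2 prosodic patterns may differ. As a rough illustration of how these three correlates could be measured from a digitized recording, the following Python sketch computes them for a synthetic tone. It is not the authors' analysis procedure and does not reproduce the Computerized Speech Lab software. The 16 kHz sampling rate, the 200 Hz test tone, the pitch search range, and all function names are assumptions chosen for the example, not values taken from the study.

```python
import math

# Hypothetical sampling rate for this example only; the study's own rate is not assumed here.
SAMPLE_RATE_HZ = 16000


def duration_s(samples, sample_rate_hz):
    """Duration of the segment in seconds."""
    return len(samples) / sample_rate_hz


def rms_amplitude(samples):
    """Root-mean-square amplitude, a simple proxy for loudness."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def estimate_f0_hz(samples, sample_rate_hz, f0_min=75.0, f0_max=400.0):
    """Estimate pitch by picking the lag with the highest raw autocorrelation.

    The search range (75-400 Hz) is an illustrative assumption covering
    typical adult speaking pitch.
    """
    min_lag = int(sample_rate_hz / f0_max)
    max_lag = min(int(sample_rate_hz / f0_min), len(samples) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate_hz / best_lag


# Synthetic stand-in for a vowel: 0.1 s of a 200 Hz sine at half amplitude.
tone = [0.5 * math.sin(2 * math.pi * 200 * i / SAMPLE_RATE_HZ)
        for i in range(1600)]

print("duration (s):", duration_s(tone, SAMPLE_RATE_HZ))
print("RMS amplitude:", rms_amplitude(tone))
print("estimated F0 (Hz):", estimate_f0_hz(tone, SAMPLE_RATE_HZ))
```

In practice, measurements like these would be taken on hand-segmented vowels (for example, the bracketed segments in the Stress row of Table 1) and compared between the model speaker and each learner. Real speech would also need windowing and a more robust pitch tracker than this raw autocorrelation.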