tag:blogger.com,1999:blog-9019034693239473022023-11-17T02:40:51.029+09:00 Eunjeon Hannip (은전한닢) Project Let's build an open-source Korean morphological analyzer that is good enough for search!이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.comBlogger102125tag:blogger.com,1999:blog-901903469323947302.post-31349757010998800622018-07-21T04:36:00.001+09:002018-07-21T04:36:17.769+09:00mecab-ko-dic-2.1.1-20180720 released.<span style="background-color: #d9ead3;">mecab-ko-dic-2.1.1-20180720</span><span style="background-color: white;"> is</span> now available.<br />
<br />
<b>Bug fixes</b><br />
<ul>
<li>Fixed a training-data problem that made the occurrence cost of NNG/장소 (place) abnormally high, then retrained the model</li>
</ul>
<div>
<b>Dictionary</b><br />
<ul>
<li>Added 오피스/NNG/장소 ("office": common noun, place)</li>
</ul>
</div>
You can download it <a href="https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.1.1-20180720.tar.gz" target="_blank">here</a>.이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com6tag:blogger.com,1999:blog-901903469323947302.post-17824880395759101212018-07-17T11:17:00.001+09:002018-07-17T11:17:27.649+09:00mecab-ko-dic-2.1.0-20180716 released.<span style="background-color: #d9ead3;">mecab-ko-dic-2.1.0-20180716</span><span style="background-color: white;"> is</span> now available.<br />
<br />
<b>New features</b><br />
<ul>
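<li>To reduce misanalysis, added the NNG (common noun) semantic classes '장소' (place), '행위' (action), '상태변화' (change of state), and '정적사태' (static situation), then retrained the model</li>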
<ul>
<li>This is expected to reduce excessive attachment of NNG (common noun) to XSV (verb-derivational suffix) and XSA (adjective-derivational suffix)</li>
</ul>
</ul>
<div>
<b>Dictionary</b><br />
<ul>
<li>Where a semantic class contained '/', the character is replaced with '|'. e.g. '성분부사/양태부사' -> '성분부사|양태부사'</li>
<li>Other minor dictionary fixes</li>
</ul>
</div>
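The '/' to '|' replacement in semantic classes can be pictured with a one-line shell helper. This is only an illustration of the rule, not the tooling actually used to build the dictionary; the function name `normalize_semantic_class` is made up for this sketch.

```shell
# Hypothetical helper (not part of mecab-ko-dic):
# replace every '/' inside a semantic class with '|'.
normalize_semantic_class() {
  printf '%s\n' "$1" | tr '/' '|'
}

# Example from the release note above.
normalize_semantic_class '성분부사/양태부사'
```

A class without '/' passes through unchanged, so the helper can be applied to every entry.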
You can download it <a href="https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.1.0-20180716.tar.gz" target="_blank">here</a>.이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com1tag:blogger.com,1999:blog-901903469323947302.post-60782546385702302062018-02-10T19:26:00.000+09:002018-02-10T19:26:55.111+09:00seunjeon-1.5.0/elasticsearch-analysis-seunjeon 6.1.1.1 released.Thanks to a contribution from <a href="https://bitbucket.org/vengadan_amzn/">vengadan</a>, dictionary compression has been added. We hope this makes it possible to run on low-spec instances. <br />
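Analysis now also runs one line (line feed) at a time, which keeps memory use to a minimum even for large documents.<br />
Overall, this release pays close attention to heap memory use.<br />
There still seem to be many places worth improving, but time is short, so I will work on them gradually when I can.<br />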
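The idea of analyzing one line feed at a time can be pictured with a small shell sketch. This is not seunjeon's code (seunjeon is written in Scala); `process_by_line` and `line_length` are made-up names, and `line_length` merely stands in for a real analyzer call.

```shell
# Hypothetical sketch: hand stdin to a handler one line at a time,
# so only the current line has to be held in memory.
process_by_line() {
  handler=$1
  # The extra -n check also handles a last line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    "$handler" "$line"
  done
}

# Stand-in for an analyzer call: prints the length of the line.
line_length() {
  printf '%s\n' "${#1}"
}
```

Usage would look like `process_by_line line_length < big-document.txt`, where the file name is a placeholder.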
<br />
Manual:<br />
* <a href="https://bitbucket.org/eunjeon/seunjeon/src/d2c16421897cd76ef2edab451d674d81b0a14a00?at=master">elasticsearch plugin</a><br />
* <a href="https://bitbucket.org/eunjeon/seunjeon">seunjeon</a>유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com1tag:blogger.com,1999:blog-901903469323947302.post-2179906725178948992017-09-10T16:22:00.000+09:002017-09-10T16:23:26.930+09:00elasticsearch-analysis-seunjeon 5.4.1.0 released.<div>
It has been a while since the last update.</div>
<div>
<br /></div>
[Changes]<br />
<div>
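* User dictionaries are now loaded separately for each index. Previously the dictionary was managed as a singleton, so applying a changed user dictionary required restarting elasticsearch. Now, if you create a new index after changing the dictionary, a new analyzer instance is created and a new user dictionary instance is created along with it. The downside is that, because the user dictionary is loaded per index, memory use can grow by the size of the dictionary for each index. The system dictionary, however, is still a singleton.<br />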
<br />
README: <a href="https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/">https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/</a></div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-24734764730891726702017-06-04T20:27:00.002+09:002017-06-16T22:19:50.095+09:00elasticsearch-analysis-seunjeon downloaderEvery time a new elasticsearch version was released, the plugin also had to be re-released before it could be installed. The plugin archive contains a plugin-descriptor.properties file, and installation fails when the version recorded in that file does not match. Since plugin development is not our main job, supporting every release was not easy. So I wrote a script that changes the version in plugin-descriptor.properties, removing the need for a new release each time. Used as shown below, it lets you install seunjeon whenever a new elasticsearch comes out (provided the elasticsearch interface has not changed).<br />
<br />
<br />
<div class="codehilite language-bash" style="background-color: white; color: #172b4d; font-family: -apple-system, system-ui, "Segoe UI", Roboto, "Noto Sans", Oxygen, Ubuntu, "Droid Sans", "Helvetica Neue", sans-serif; font-size: 14px; letter-spacing: -0.07px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, "Liberation Mono", Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="c1" style="color: #999988; font-style: italic;"># download plugin</span>
$ bash <<span class="o" style="font-weight: 700;">(</span>curl -s https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/scripts/downloader.sh<span class="o" style="font-weight: 700;">)</span> -e <es-version> -p <plugin-version>
<span class="c1" style="color: #999988; font-style: italic;"># install plugin</span>
$ ./bin/elasticsearch-plugin install file://<span class="sb" style="color: #bb8844;">`</span><span class="nb" style="color: #999999;">pwd</span><span class="sb" style="color: #bb8844;">`</span>/elasticsearch-analysis-seunjeon-<plugin-version>.zip
</pre>
</div>
<ul style="background-color: white; color: #172b4d; font-family: -apple-system, system-ui, "Segoe UI", Roboto, "Noto Sans", Oxygen, Ubuntu, "Droid Sans", "Helvetica Neue", sans-serif; font-size: 14px; letter-spacing: -0.07px; margin: 12px 0px 0px; padding: 0px 0px 0px 40px;">
<li style="word-wrap: break-word;">downloader.sh downloads elasticsearch-analysis-seunjeon-<plugin-version>.zip, changes elasticsearch.version in plugin-descriptor.properties, and repackages the archive.</li>
<li style="margin: 0px; word-wrap: break-word;">The script is provided because re-releasing the plugin for every elasticsearch version bump is difficult.</li>
</ul>
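The version-rewriting step described above can be sketched as a small shell function. This is not the actual downloader.sh: the function name `set_es_version` is made up, the `key=value` layout of plugin-descriptor.properties is an assumption, and the download, unzip, and re-zip steps are left out.

```shell
# Hypothetical helper, not the real downloader.sh: rewrite the
# elasticsearch.version line of a plugin-descriptor.properties file.
set_es_version() {
  descriptor=$1   # path to an extracted plugin-descriptor.properties
  es_version=$2   # target elasticsearch version
  # Write to a temporary file first so a failed sed leaves the original intact.
  sed "s/^elasticsearch\.version=.*/elasticsearch.version=${es_version}/" \
    "$descriptor" > "${descriptor}.tmp" && mv "${descriptor}.tmp" "$descriptor"
}
```

After extracting the plugin zip, `set_es_version <path-to>/plugin-descriptor.properties <es-version>` would update the file in place. All other lines in the file are left untouched.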
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com2tag:blogger.com,1999:blog-901903469323947302.post-9655505690604515112016-12-25T02:05:00.001+09:002016-12-25T02:05:52.746+09:00mecab-ko-lucene-analyzer-0.21.0, elasticsearch-analysis-mecab-ko-5.1.1.0 released.<span style="background-color: #d9ead3;">mecab-ko-lucene-analyzer-0.21.0, elasticsearch-analysis-mecab-ko-5.1.1.0</span> are now available.<br />
<br />
<b>New features</b><br />
<ul>
<li>Support for Lucene/Solr 6.3.0</li>
<li>Support for Elasticsearch 5.1.1 - <a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/issues/6/elasticsearch-511" rel="nofollow" target="_blank">issue #6</a></li>
</ul>
<b>Fixes</b><br />
<ul>
<li>An UnsatisfiedLinkError is now thrown when loading fails - <a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/issues/5/mecabloader" rel="nofollow" target="_blank">issue #5</a></li>
</ul>
<div>
Thanks to Jaepil Jeong for sending a source patch and to devimapreduce for filing the issue.</div>
<div>
<br /></div>
<div>
<div>
Analyzer for Lucene/Solr</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/mecab-ko-lucene-analyzer-0.21.0.tar.gz" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/2b345772535f081a3445da97d8738d03ed011ca0/?at=release-0.21.0" rel="nofollow" target="_blank"><span id="goog_2082698774"></span><span id="goog_2082698775"></span>Installation guide</a></li>
</ul>
</div>
<div>
Analyzer for Elasticsearch</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/elasticsearch-analysis-mecab-ko-5.1.1.0.zip" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/2b345772535f081a3445da97d8738d03ed011ca0/elasticsearch-analysis-mecab-ko/?at=release-0.21.0" rel="nofollow" target="_blank">Installation guide</a></li>
</ul>
</div>
</div>
이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com1tag:blogger.com,1999:blog-901903469323947302.post-45697478624765468362016-12-24T13:13:00.000+09:002016-12-24T13:13:17.134+09:00seunjeon1.3.0 / elasticsearch-analysis-seunjeon 5.1.1.1 released.Compound nouns can now be added to the user dictionary.<br />
<br />
Use '+' to mark the components of a compound noun.<br />
Example: "낄끼+빠빠"<br />
A literal '+' can be escaped with '\'.<br />
Example: "c\+\+"<br />
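To make the '+' and '\' rules concrete, here is a minimal shell sketch that splits the surface part of a user dictionary entry on unescaped '+' signs. It illustrates the notation only and is not how seunjeon parses entries internally; `split_compound` is a made-up name, and the optional ",cost" suffix is assumed to have been removed beforehand.

```shell
# Hypothetical sketch: print one compound component per line.
# An unescaped '+' separates components; a backslash keeps the next
# character literal, so 'c\+\+' stays a single word "c++".
split_compound() {
  printf '%s\n' "$1" | awk '{
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\\" && i < n) {   # escape: copy the following character as-is
        i++
        out = out substr($0, i, 1)
      } else if (c == "+") {      # unescaped plus: component boundary
        print out
        out = ""
      } else {
        out = out c
      }
    }
    print out
  }'
}
```

With this sketch, `split_compound '낄끼+빠빠'` yields the two components 낄끼 and 빠빠, while `split_compound 'c\+\+'` yields the single word c++.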
<h3>
seunjeon</h3>
<div>
<ul>
<li>https://bitbucket.org/eunjeon/seunjeon</li>
</ul>
</div>
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, Menlo, "Liberation Mono", Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="k" style="font-weight: 700;">import</span> <span class="nn" style="color: #555555;">org.bitbucket.eunjeon.seunjeon.Analyzer</span>
<span class="c1" style="color: #999988; font-style: italic;">// morphological analysis</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parse</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"아버지가방에들어가신다."</span><span class="o" style="font-weight: 700;">).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span>
<span class="c1" style="color: #999988; font-style: italic;">// eojeol (word-phrase) analysis</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parseEojeol</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"아버지가방에들어가신다."</span><span class="o" style="font-weight: 700;">).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span>
<span class="c1" style="color: #999988; font-style: italic;">// or</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parseEojeol</span><span class="o" style="font-weight: 700;">(</span><span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parse</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"아버지가방에들어가신다."</span><span class="o" style="font-weight: 700;">)).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span>
<span class="cm" style="color: #999988; font-style: italic;">/**</span>
<span class="cm" style="color: #999988; font-style: italic;"> * Add a user dictionary</span>
<span class="cm" style="color: #999988; font-style: italic;"> * surface,cost</span>
<span class="cm" style="color: #999988; font-style: italic;"> * surface: the word. '+' can be used to build a compound noun.</span>
<span class="cm" style="color: #999988; font-style: italic;"> * To register a literal '+' in the dictionary, enter it as '\+'. For example 'C\+\+'</span>
<span class="cm" style="color: #999988; font-style: italic;"> * cost: occurrence cost of the word. The lower the cost, the more likely the word is to appear.</span>
<span class="cm" style="color: #999988; font-style: italic;"> */</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">setUserDict</span><span class="o" style="font-weight: 700;">(</span><span class="nc" style="color: #445588; font-weight: 700;">Seq</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"덕후"</span><span class="o" style="font-weight: 700;">,</span> <span class="s" style="color: #bb8844;">"버카충,-100"</span><span class="o" style="font-weight: 700;">,</span> <span class="s" style="color: #bb8844;">"낄끼+빠빠,-100"</span><span class="o" style="font-weight: 700;">,</span> <span class="s" style="color: #bb8844;">"""C\+\+"""</span><span class="o" style="font-weight: 700;">).</span><span class="n">toIterator</span><span class="o" style="font-weight: 700;">)</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parse</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"덕후냄새가 난다."</span><span class="o" style="font-weight: 700;">).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span>
<span class="c1" style="color: #999988; font-style: italic;">// base form of inflected words</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parse</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"빨라짐"</span><span class="o" style="font-weight: 700;">).</span><span class="n">flatMap</span><span class="o" style="font-weight: 700;">(</span><span class="k" style="font-weight: 700;">_</span><span class="o" style="font-weight: 700;">.</span><span class="n">deInflect</span><span class="o" style="font-weight: 700;">()).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span>
<span class="c1" style="color: #999988; font-style: italic;">// compound noun decomposition</span>
<span class="k" style="font-weight: 700;">val</span> <span class="n">ggilggi</span> <span class="k" style="font-weight: 700;">=</span> <span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parse</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"낄끼빠빠"</span><span class="o" style="font-weight: 700;">)</span>
<span class="n">ggilggi</span><span class="o" style="font-weight: 700;">.</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span> <span class="c1" style="color: #999988; font-style: italic;">// 낄끼빠빠</span>
<span class="n">ggilggi</span><span class="o" style="font-weight: 700;">.</span><span class="n">flatMap</span><span class="o" style="font-weight: 700;">(</span><span class="k" style="font-weight: 700;">_</span><span class="o" style="font-weight: 700;">.</span><span class="n">deCompound</span><span class="o" style="font-weight: 700;">()).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span> <span class="c1" style="color: #999988; font-style: italic;">// 낄끼+빠빠</span>
<span class="nc" style="color: #445588; font-weight: 700;">Analyzer</span><span class="o" style="font-weight: 700;">.</span><span class="n">parse</span><span class="o" style="font-weight: 700;">(</span><span class="s" style="color: #bb8844;">"C++"</span><span class="o" style="font-weight: 700;">).</span><span class="n">flatMap</span><span class="o" style="font-weight: 700;">(</span><span class="k" style="font-weight: 700;">_</span><span class="o" style="font-weight: 700;">.</span><span class="n">deInflect</span><span class="o" style="font-weight: 700;">()).</span><span class="n">foreach</span><span class="o" style="font-weight: 700;">(</span><span class="n">println</span><span class="o" style="font-weight: 700;">)</span> <span class="c1" style="color: #999988; font-style: italic;">// C++</span></pre>
<br />
<h3>
analysis-seunjeon</h3>
<div>
<ul>
<li>https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/</li>
</ul>
</div>
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); color: #333333; font-family: Consolas, Menlo, "Liberation Mono", Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="ch">#!/usr/bin/env bash</span>
<span class="nv" style="color: teal;">ES</span><span class="o" style="font-weight: 700;">=</span><span class="s1" style="color: #bb8844;">'http://localhost:9200'</span>
<span class="nv" style="color: teal;">ESIDX</span><span class="o" style="font-weight: 700;">=</span><span class="s1" style="color: #bb8844;">'seunjeon-idx'</span>
curl -XDELETE <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">?pretty"</span>
sleep 1
curl -XPUT <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/?pretty"</span> -d <span class="s1" style="color: #bb8844;">'{</span>
<span class="s1" style="color: #bb8844;"> "settings" : {</span>
<span class="s1" style="color: #bb8844;"> "index":{</span>
<span class="s1" style="color: #bb8844;"> "analysis":{</span>
<span class="s1" style="color: #bb8844;"> "analyzer":{</span>
<span class="s1" style="color: #bb8844;"> "korean":{</span>
<span class="s1" style="color: #bb8844;"> "type":"custom",</span>
<span class="s1" style="color: #bb8844;"> "tokenizer":"seunjeon_default_tokenizer"</span>
<span class="s1" style="color: #bb8844;"> }</span>
<span class="s1" style="color: #bb8844;"> },</span>
<span class="s1" style="color: #bb8844;"> "tokenizer": {</span>
<span class="s1" style="color: #bb8844;"> "seunjeon_default_tokenizer": {</span>
<span class="s1" style="color: #bb8844;"> "type": "seunjeon_tokenizer",</span>
<span class="s1" style="color: #bb8844;"> "index_eojeol": false,</span>
<span class="s1" style="color: #bb8844;"> "user_words": ["낄끼+빠빠,-100", "c\\+\\+", "어그로", "버카충", "abc마트"]</span>
<span class="s1" style="color: #bb8844;"> }</span>
<span class="s1" style="color: #bb8844;"> }</span>
<span class="s1" style="color: #bb8844;"> }</span>
<span class="s1" style="color: #bb8844;"> }</span>
<span class="s1" style="color: #bb8844;"> }</span>
<span class="s1" style="color: #bb8844;">}'</span>
sleep 1
<span class="nb" style="color: #999999;">echo</span> <span class="s2" style="color: #bb8844;">"# 삼성/N 전자/N"</span>
curl -XGET <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/_analyze?analyzer=korean&pretty"</span> -d <span class="s1" style="color: #bb8844;">'삼성전자'</span>
<span class="nb" style="color: #999999;">echo</span> <span class="s2" style="color: #bb8844;">"# 빠르/V 지/V"</span>
curl -XGET <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/_analyze?analyzer=korean&pretty"</span> -d <span class="s1" style="color: #bb8844;">'빨라짐'</span>
<span class="nb" style="color: #999999;">echo</span> <span class="s2" style="color: #bb8844;">"# 슬프/V"</span>
curl -XGET <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/_analyze?analyzer=korean&pretty"</span> -d <span class="s1" style="color: #bb8844;">'슬픈'</span>
<span class="nb" style="color: #999999;">echo</span> <span class="s2" style="color: #bb8844;">"# 새롭/V 사전/N 생성/N"</span>
curl -XGET <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/_analyze?analyzer=korean&pretty"</span> -d <span class="s1" style="color: #bb8844;">'새로운사전생성'</span>
<span class="nb" style="color: #999999;">echo</span> <span class="s2" style="color: #bb8844;">"# 낄끼/N 빠빠/N c++/N"</span>
curl -XGET <span class="s2" style="color: #bb8844;">"</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ES</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/</span><span class="si" style="color: #bb8844;">${</span><span class="nv" style="color: teal;">ESIDX</span><span class="si" style="color: #bb8844;">}</span><span class="s2" style="color: #bb8844;">/_analyze?analyzer=korean&pretty"</span> -d <span class="s1" style="color: #bb8844;">'낄끼빠빠 c++'</span></pre>
<br />
<br />
<br />
<br />
<br />유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com1tag:blogger.com,1999:blog-901903469323947302.post-76063247374179480352016-12-15T22:07:00.001+09:002016-12-15T22:07:27.838+09:00elasticsearch-analysis-seunjeon 5.1.1.0 released.This is a rebuild for elasticsearch 5.1.1 with no functional changes.<br />
<br />
From now on, releases that only track a new elasticsearch version without adding features will not be announced separately.<br />
The list of releases will be kept in the <a href="https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/">bitbucket README</a>.<br />
<br />
Thank you.유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-81157143292671378822016-11-06T18:01:00.000+09:002016-11-06T18:01:07.173+09:00elasticsearch-analysis-seunjeon 5.0.0.0 released.This release provides a plugin that supports <a href="https://www.elastic.co/guide/en/elasticsearch/reference/5.x/index.html">elasticsearch 5.0.0</a>. As stated in the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/setup.html">elastic team's guide</a>, it must be run on <a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">java 8</a> or later.<br />
<br />
No features have been added or changed.<br />
<br />
<div>
<h4>
Source and manual</h4>
<div>
</div>
</div>
<br />
<div>
</div>
<br />
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: "Apple SD Gothic Neo"; font-size: medium; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
<div style="margin: 0px;">
<a href="https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/">https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/</a></div>
<h2 id="markdown-header-" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
Installation</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, "Liberation Mono", Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/elasticsearch-plugin install https://oss.sonatype.org/service/local/repositories/releases/content/org/bitbucket/eunjeon/elasticsearch-analysis-seunjeon/5.0.0.0/elasticsearch-analysis-seunjeon-5.0.0.0.zip
</pre>
</div>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com3tag:blogger.com,1999:blog-901903469323947302.post-25935801514952723382016-11-06T17:51:00.001+09:002016-11-06T17:51:07.809+09:00seunjeon-1.2.0 released.No features have been added. scala 2.12 is now supported. Because of compatibility problems, scala 2.10 is no longer supported. If you need 2.10, use seunjeon-1.1.1.<br />
<br />
<br />
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: "Apple SD Gothic Neo"; font-size: medium; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
</div>
<br />
<ul style="-webkit-text-stroke-width: 0px; color: black; font-family: "Apple SD Gothic Neo"; font-size: medium; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
<li>Source and manual: <a href="https://bitbucket.org/eunjeon/seunjeon">https://bitbucket.org/eunjeon/seunjeon</a></li>
<li>Compatible scala & jvm by version</li>
</ul>
<table style="background-color: white; border-collapse: collapse; color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px; width: 626px;"><thead style="border-bottom: 1px solid rgb(204, 204, 204);">
<tr><th style="padding: 7px 10px; text-align: left; vertical-align: top;">version</th><th style="padding: 7px 10px; text-align: left; vertical-align: top;">scala(java)</th><th style="padding: 7px 10px; text-align: left; vertical-align: top;">note</th></tr>
</thead><tbody>
<tr style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border-bottom: 1px solid rgb(204, 204, 204);"><td style="border-top: 1px solid rgb(204, 204, 204); padding: 7px 10px; vertical-align: top;">1.2.0</td><td style="border-top: 1px solid rgb(204, 204, 204); padding: 7px 10px; vertical-align: top;">2.11(1.7), 2.12(1.8)</td><td style="border-top: 1px solid rgb(204, 204, 204); padding: 7px 10px; vertical-align: top;">no new features</td></tr>
<tr style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border-bottom: 1px solid rgb(204, 204, 204);"><td style="padding: 7px 10px; vertical-align: top;">1.1.1</td><td style="padding: 7px 10px; vertical-align: top;">2.10(1.7), 2.11(1.7)</td><td style="padding: 7px 10px; vertical-align: top;"><br /></td></tr>
</tbody></table>
<br />
<h3 id="markdown-header-maven" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
Maven</h3>
<div class="codehilite" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, "Liberation Mono", Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="nt" style="color: navy;"><dependencies></span>
<span class="nt" style="color: navy;"><dependency></span>
<span class="nt" style="color: navy;"><groupId></span>org.bitbucket.eunjeon<span class="nt" style="color: navy;"></groupId></span>
<span class="nt" style="color: navy;"><artifactId></span>seunjeon_2.11<span class="nt" style="color: navy;"></artifactId></span>
<span class="nt" style="color: navy;"><version></span>1.2.0<span class="nt" style="color: navy;"></version></span>
<span class="nt" style="color: navy;"></dependency></span>
<span class="nt" style="color: navy;"></dependencies></span>
</pre>
</div>
<h3 id="markdown-header-sbt" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
SBT</h3>
<div class="codehilite" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, "Liberation Mono", Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="n">libraryDependencies</span> <span class="o" style="font-weight: 700;">+=</span> <span class="s" style="color: #bb8844;">"org.bitbucket.eunjeon"</span> <span class="o" style="font-weight: 700;">%%</span> <span class="s" style="color: #bb8844;">"seunjeon"</span> <span class="o" style="font-weight: 700;">%</span> <span class="s" style="color: #bb8844;">"1.2.0"</span>
</pre>
</div>
<h4>
Changes</h4>
<div>
<ul>
<li>Removed the scala-csv library.</li>
<li>Added scala 2.12 support</li>
</ul>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-31817980690321141292016-09-28T01:01:00.000+09:002016-09-28T01:01:01.057+09:00elasticsearch-analysis-seunjeon 2.4.0.1 released.This is the release of elasticsearch plugin 2.4.0.1.<br />
<br />
<h4>
Fixes:</h4>
<div>
Fixed a problem where some words were not deinflected.</div>
<div>
<br /></div>
<div>
<div>
<h4>
Source and manual</h4>
<div>
</div>
</div>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/">https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/</a></div>
<div>
<br /></div>
<div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, "liberation mono", courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.4.0.1
</pre>
<div>
<br /></div>
<div>
</div>
</div>
</div>
</div>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com1tag:blogger.com,1999:blog-901903469323947302.post-23068556068524885562016-09-28T00:57:00.003+09:002016-09-28T00:57:45.837+09:00seunjeon-1.1.1 released.This is a bug-fix release. It fixes a problem where some inflected words (Inflect) were not deinflected.<br />
<div>
<ul>
<li>Source and manual: <a href="https://bitbucket.org/eunjeon/seunjeon">https://bitbucket.org/eunjeon/seunjeon</a></li>
</ul>
<div>
<h2 id="markdown-header-maven" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
Maven</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="nt" style="color: navy;"><dependencies></span>
<span class="nt" style="color: navy;"><dependency></span>
<span class="nt" style="color: navy;"><groupId></span>org.bitbucket.eunjeon<span class="nt" style="color: navy;"></groupId></span>
<span class="nt" style="color: navy;"><artifactId></span>seunjeon_2.11<span class="nt" style="color: navy;"></artifactId></span>
<span class="nt" style="color: navy;"><version></span>1.1.1<span class="nt" style="color: navy;"></version></span>
<span class="nt" style="color: navy;"></dependency></span>
<span class="nt" style="color: navy;"></dependencies></span>
</pre>
</div>
<h2 id="markdown-header-sbt" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
SBT</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="n">libraryDependencies</span> <span class="o" style="font-weight: 700;">+=</span> <span class="s" style="color: #bb8844;">"org.bitbucket.eunjeon"</span> <span class="o" style="font-weight: 700;">%%</span> <span class="s" style="color: #bb8844;">"seunjeon"</span> <span class="o" style="font-weight: 700;">%</span> <span class="s" style="color: #bb8844;">"1.1.1"</span>
</pre>
</div>
<div>
<br /></div>
</div>
</div>
<div>
<h4>
Changes</h4>
</div>
<div>
<ul>
<li>Fixed words failing to deinflect.</li>
<ul>
<li>https://groups.google.com/forum/#!topic/eunjeon/H2yLE97pNt8</li>
<li>https://bitbucket.org/eunjeon/seunjeon/issues/6/es-index-analyzer-indexing</li>
</ul>
</ul>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-27420773360014633682016-09-11T16:08:00.001+09:002016-09-11T16:09:32.504+09:00elasticsearch-analysis-seunjeon 2.3.5.0 / 2.4.0.0 released. This release provides the plugin for Elasticsearch 2.3.5 / 2.4.0.<br />
<br />
<h4>
Fixes:</h4>
<div>
None.</div>
<div>
<br /></div>
<div>
<div>
<h4>
Source and manual</h4>
<div>
</div>
</div>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/6104619359f9b00bf7f5cf34a9b056f1cfa5ae43/elasticsearch?at=master">https://bitbucket.org/eunjeon/seunjeon/src/...?at=es-2.4.0.0</a></div>
<div>
<br /></div>
<div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.4.0.0
</pre>
<div>
<br /></div>
<div>
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.3.5.0
</pre>
</div>
<div>
<br /></div>
<div>
</div>
</div>
</div>
</div>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-26526400068114449452016-05-22T20:16:00.001+09:002016-05-22T20:16:37.609+09:00elasticsearch-analysis-seunjeon 2.3.3.0 / 2.2.0.1 / 2.3.2.1 released. This release provides the plugin for Elasticsearch 2.3.3. It also includes a bug fix.<br />
<br />
<h4>
Fixes:</h4>
<br />
<ul>
<li>Fixed long compound nouns being extracted as a single UNK token. For example, "농어촌체험휴양하누리마을" used to be analyzed unchanged as "농어촌체험휴양하누리마을/UNK"; it is now analyzed as "농어촌/체험/휴양/하누리/마을". Related issue: <a href="https://groups.google.com/forum/#!topic/eunjeon/eRZvjP-U69I">https://groups.google.com/forum/#!topic/eunjeon/eRZvjP-U69I</a></li>
<li>Added the max_unk_length option. It is configurable, but you will rarely need to change it.</li>
</ul>
<br />
<div>
<h4>
Source and manual</h4>
<div>
</div>
</div>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/ad2e2655ac940d2a6cc8d002c1dad1b5f807a01c/elasticsearch/?at=es-2.3.3.0">https://bitbucket.org/eunjeon/seunjeon/src/...?at=es-2.3.3.0</a></div>
<div>
<br /></div>
<div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.3.3.0
</pre>
<div>
<br /></div>
<div>
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.3.2.1
</pre>
</div>
<div>
<br /></div>
<div>
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.2.0.1
</pre>
</div>
<div>
<br /></div>
<div>
</div>
</div>
</div>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-20971958821021454262016-05-22T20:13:00.000+09:002016-05-22T20:13:46.528+09:00seunjeon-1.1.0 released. A new bug was found, so here is a new release. Thanks to 이윤희 for reporting it. <a href="https://groups.google.com/forum/#!topic/eunjeon/eRZvjP-U69I">Related issue</a><br />
<div>
<ul>
<li>Source and manual: <a href="https://bitbucket.org/eunjeon/seunjeon">https://bitbucket.org/eunjeon/seunjeon</a></li>
</ul>
<div>
<h2 id="markdown-header-maven" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
Maven</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="nt" style="color: navy;"><dependencies></span>
<span class="nt" style="color: navy;"><dependency></span>
<span class="nt" style="color: navy;"><groupId></span>org.bitbucket.eunjeon<span class="nt" style="color: navy;"></groupId></span>
<span class="nt" style="color: navy;"><artifactId></span>seunjeon_2.11<span class="nt" style="color: navy;"></artifactId></span>
<span class="nt" style="color: navy;"><version></span>1.1.0<span class="nt" style="color: navy;"></version></span>
<span class="nt" style="color: navy;"></dependency></span>
<span class="nt" style="color: navy;"></dependencies></span>
</pre>
</div>
<h2 id="markdown-header-sbt" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
SBT</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="n">libraryDependencies</span> <span class="o" style="font-weight: 700;">+=</span> <span class="s" style="color: #bb8844;">"org.bitbucket.eunjeon"</span> <span class="o" style="font-weight: 700;">%%</span> <span class="s" style="color: #bb8844;">"seunjeon"</span> <span class="o" style="font-weight: 700;">%</span> <span class="s" style="color: #bb8844;">"1.1.0"</span>
</pre>
</div>
<div>
<br /></div>
</div>
</div>
<div>
<h4>
Changes</h4>
</div>
<br />
<div>
</div>
<br />
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: 'Apple SD Gothic Neo'; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 1; word-spacing: 0px;">
<ul>
<li>Fixed long, unregistered compound nouns such as "농어촌체험휴양하누리마을" not being decomposed.</li>
<ul>
<li>Added the Analyzer.setMaxUnkLength(length:Int) interface.</li>
</ul>
</ul>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-87683992300294898472016-05-07T00:33:00.000+09:002016-05-07T00:33:03.190+09:00elasticsearch-analysis-seunjeon 2.3.2.0 released. This release provides the plugin for Elasticsearch 2.3.2. There are no other changes.<div>
<br /></div>
<div>
<h4>
Source and manual</h4>
<div>
</div>
</div>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/6e8a067fb9a12bcdcdd7f858fd84714c94835f04/elasticsearch?at=es-2.3.2.0">https://bitbucket.org/eunjeon/seunjeon/src/.../elasticsearch?at=es-2.3.2.0</a></div>
<div>
<br /></div>
<div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.3.2.0
</pre>
<div>
<br /></div>
</div>
</div>
<div>
</div>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-36183383714388909252016-04-26T21:09:00.002+09:002016-04-26T21:12:31.625+09:00elasticsearch-analysis-seunjeon 2.3.0.0 / elasticsearch-analysis-seunjeon 2.3.1.0 released. Elasticsearch does not keep plugins backward compatible, so the plugin has to be re-released every time a new Elasticsearch version comes out. In practice all it takes is changing the elasticsearch.version=x.x.x value in the plugin-descriptor.properties file, but it is quite a nuisance.<br />
<div>
<br /></div>
<div>
If an Eunjeon (은전한닢) release lags behind, you can download the files and copy them into the plugins/analysis-seunjeon/ directory yourself, or edit the plugin-descriptor.properties file inside the zip and install from that.</div>
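<div>
The descriptor edit described above can be sketched in shell. This is only a minimal sketch under stated assumptions: the helper name set_es_version, the demo directory, and the name=analysis-seunjeon line are hypothetical; only the elasticsearch.version key and the plugin-descriptor.properties file name come from this post. It runs against a throwaway file, not a real plugin zip.</div>

```shell
#!/bin/sh
# Hypothetical helper: rewrite elasticsearch.version in a plugin descriptor.
set_es_version() {
  descriptor="$1"   # path to plugin-descriptor.properties
  version="$2"      # target Elasticsearch version, e.g. 2.3.2
  tmp="${descriptor}.tmp"
  # Replace only the elasticsearch.version line; every other key is left untouched.
  sed "s/^elasticsearch\.version=.*/elasticsearch.version=${version}/" "$descriptor" > "$tmp" &&
    mv "$tmp" "$descriptor"
}

# Demo on a throwaway descriptor (contents are made up for illustration).
demo_dir=$(mktemp -d)
printf 'name=analysis-seunjeon\nelasticsearch.version=2.3.1\n' > "$demo_dir/plugin-descriptor.properties"
set_es_version "$demo_dir/plugin-descriptor.properties" "2.3.2"
cat "$demo_dir/plugin-descriptor.properties"
```

<div>
For a real plugin you would presumably unzip the archive, run the helper on its descriptor, and zip it again before installing; whether the plugin then works on the newer Elasticsearch version still has to be checked.</div>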
<div>
<br /></div>
<div>
<h4>
Source and manual</h4>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/a5c06e0d2017337c345add8d1b62650348436a57/elasticsearch?at=es-2.3.0.0">https://bitbucket.org/eunjeon/seunjeon/src/.../elasticsearch?at=es-2.3.0.0</a><br />
<a href="https://bitbucket.org/eunjeon/seunjeon/src/d028979cba037e420b83594a79c1e38b163b3d50/elasticsearch?at=es-2.3.1.0">https://bitbucket.org/eunjeon/seunjeon/src/.../elasticsearch?at=es-2.3.1.0</a></div>
<div>
<br /></div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.3.0.0
</pre>
<div>
<br /></div>
<div>
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.3.1.0
</pre>
</div>
<div>
<br /></div>
<div>
</div>
</div>
</div>
</div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-77151586926725035872016-04-06T01:13:00.000+09:002016-04-06T01:13:14.714+09:00mecab-ko-lucene-analyzer-0.20.1, elasticsesarch-analysis-mecab-ko-2.3.1.0 배포합니다.<span style="background-color: #d9ead3;">mecab-ko-lucene-analyzer-0.20.1, elasticsesarch-analysis-mecab-ko-2.3.1.0</span>이 나왔습니다.<br />
<br />
<b>New features</b><br />
<ul>
<li>Elasticsearch 2.3.1 support</li>
</ul>
<div>
mecab-ko-lucene-analyzer itself is unchanged, so there is no need to upgrade it.</div>
<div>
<div>
<br />
Analyzer for Lucene/Solr</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/mecab-ko-lucene-analyzer-0.20.1.tar.gz" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/2b345772535f081a3445da97d8738d03ed011ca0/?at=release-0.20.1" target="_blank">Installation instructions</a></li>
</ul>
</div>
<div>
Analyzer for Elasticsearch</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/elasticsearch-analysis-mecab-ko-2.3.1.0.zip" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/2b345772535f081a3445da97d8738d03ed011ca0/elasticsearch-analysis-mecab-ko/?at=release-0.20.1" target="_blank">Installation instructions</a></li>
</ul>
</div>
</div>
이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-6853585918997916252016-04-05T00:51:00.004+09:002016-04-05T00:51:49.717+09:00mecab-ko-lucene-analyzer-0.20.0, elasticsearch-analysis-mecab-ko-2.3.0.0 released. <span style="background-color: #d9ead3;">mecab-ko-lucene-analyzer-0.20.0, elasticsearch-analysis-mecab-ko-2.3.0.0</span> are out.<br />
<br />
<b>New features</b><br />
<ul>
<li>Elasticsearch 2.3.0 support</li>
</ul>
<div>
mecab-ko-lucene-analyzer itself is unchanged, so there is no need to upgrade it.</div>
<div>
<div>
<br />
Analyzer for Lucene/Solr</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/mecab-ko-lucene-analyzer-0.20.0.tar.gz" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/32818c92a9a517dad0cf4be636397a985eaf2c91/?at=release-0.20.0" target="_blank">Installation instructions</a></li>
</ul>
</div>
<div>
Analyzer for Elasticsearch</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/elasticsearch-analysis-mecab-ko-2.3.0.0.zip" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/32818c92a9a517dad0cf4be636397a985eaf2c91/elasticsearch-analysis-mecab-ko/?at=release-0.20.0" target="_blank">Installation instructions</a></li>
</ul>
</div>
</div>
이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-44948048080104497182016-04-05T00:49:00.001+09:002016-04-05T00:49:13.409+09:00mecab-ko-lucene-analyzer-0.19.2, elasticsearch-analysis-mecab-ko-2.2.2.0 released. <span style="background-color: #d9ead3;">mecab-ko-lucene-analyzer-0.19.2, elasticsearch-analysis-mecab-ko-2.2.2.0</span> are out.<br />
<br />
<b>New features</b><br />
<ul>
<li>Elasticsearch 2.2.2 support</li>
</ul>
<div>
mecab-ko-lucene-analyzer itself is unchanged, so there is no need to upgrade it.</div>
<div>
<div>
<br />
Analyzer for Lucene/Solr</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/mecab-ko-lucene-analyzer-0.19.2.tar.gz" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/64373d298a8fbe7240d7ebe1d6788d9bd2b8de4c/?at=release-0.19.2" target="_blank">Installation instructions</a></li>
</ul>
</div>
<div>
Analyzer for Elasticsearch</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/elasticsearch-analysis-mecab-ko-2.2.2.0.zip" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/64373d298a8fbe7240d7ebe1d6788d9bd2b8de4c/elasticsearch-analysis-mecab-ko/?at=release-0.19.2" target="_blank">Installation instructions</a></li>
</ul>
</div>
</div>
이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-45945542209326218762016-04-05T00:46:00.001+09:002016-04-05T00:46:36.175+09:00mecab-ko-lucene-analyzer-0.19.1, elasticsearch-analysis-mecab-ko-2.2.1.0 released. <span style="background-color: #d9ead3;">mecab-ko-lucene-analyzer-0.19.1, elasticsearch-analysis-mecab-ko-2.2.1.0</span> are out.<br />
<br />
<b>New features</b><br />
<ul>
<li>Elasticsearch 2.2.1 support</li>
</ul>
<div>
mecab-ko-lucene-analyzer itself is unchanged, so there is no need to upgrade it.<br />
<br />
Thanks to 'MooYoul Lee' for providing the patch.</div>
<div>
<div>
<br />
Analyzer for Lucene/Solr</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/mecab-ko-lucene-analyzer-0.19.1.tar.gz" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/3009c4d6ed18c2f9025247666c4befd0754ccb5f/?at=v0.19.1" target="_blank">Installation instructions</a></li>
</ul>
</div>
<div>
Analyzer for Elasticsearch</div>
<div>
<ul>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/downloads/elasticsearch-analysis-mecab-ko-2.2.1.0.zip" target="_blank">Download</a></li>
<li><a href="https://bitbucket.org/eunjeon/mecab-ko-lucene-analyzer/src/3009c4d6ed18c2f9025247666c4befd0754ccb5f/elasticsearch-analysis-mecab-ko/?at=v0.19.1" target="_blank">Installation instructions</a></li>
</ul>
</div>
</div>
이용운http://www.blogger.com/profile/11207329648297372888noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-70118998272013333572016-02-24T22:45:00.000+09:002016-02-24T22:45:06.002+09:00elasticsearch-analysis-seunjeon 2.2.0.0 released. This release provides the plugin for Elasticsearch 2.2.0. Having to release the plugin for every Elasticsearch version is a hassle, so please bear with me if a release arrives a little late.<br />
<br />
No new functionality has been added.<br />
However, dictionary lookups previously used <a href="https://github.com/takawitter/trie4j">https://github.com/takawitter/trie4j</a> internally, and because trie4j stores values as boxed Integers, performance left something to be desired. So I implemented a double-array trie myself. The gain is not dramatic, but performance is somewhat better. If you spot a bug, please don't hesitate to report it right away ^^<br />
<br />
<h4>
Source and manual</h4>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/cda91713254e0d89c90be49a87f1a38ca4ed36fd/elasticsearch?at=es-2.2.0.0">https://bitbucket.org/eunjeon/seunjeon/src/.../elasticsearch?at=es-2.2.0.0</a></div>
<div>
<br /></div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.2.0.0
</pre>
<div>
<br /></div>
</div>
</div>
<div>
<br /></div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-75514845697747677202016-02-20T19:13:00.003+09:002016-02-20T19:13:25.660+09:00elasticsearch-analysis-seunjeon 2.1.1.3 released. This is a new bug-fix release. Thanks to everyone who reported bugs and sent feedback.<div>
<br /></div>
<h4>
Source and manual</h4>
<div>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/d99f126d93dd2e432da77c588788d63028947d85/elasticsearch?at=es-2.1.1.3">https://bitbucket.org/eunjeon/seunjeon/src/.../elasticsearch?at=es-2.1.1.3</a></div>
<div>
<br /></div>
<h4>
Installation</h4>
<div>
<div class="codehilite" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, 'Liberation Mono', Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.1.1.3
</pre>
<div>
<br /></div>
</div>
</div>
<div>
<br /></div>
<h4>
Changes</h4>
<div>
<ul>
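<li>Fixed an exception caused by an offset error when extracting the base form of inflected words</li>
<li>English text is now converted to lowercase before analysis, so English words only need to be maintained in lowercase</li>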
</ul>
</div>
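<div>
Since English input is lowercased before analysis, a word list can be normalized the same way. A minimal sketch, assuming a plain-text list with one entry per line: the helper name lowercase_ascii and the sample entries are hypothetical, and this is not part of the plugin itself. It lowercases ASCII letters only and leaves Korean text untouched.</div>

```shell
#!/bin/sh
# Hypothetical helper: lowercase ASCII letters on stdin, leaving all other bytes as-is.
lowercase_ascii() {
  # LC_ALL=C makes tr work byte-wise, so multi-byte Korean characters pass through unchanged.
  LC_ALL=C tr 'A-Z' 'a-z'
}

# Demo word list (made-up entries, one per line).
normalized=$(printf 'iPhone\nElasticsearch\n은전한닢\n' | lowercase_ascii)
printf '%s\n' "$normalized"
```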
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-79191248653556352912016-02-20T19:09:00.001+09:002016-02-20T19:09:23.570+09:00seunjeon-1.0.4 released. This is a new bug-fix release. Lately it feels like bugs are forcing releases far too often.<div>
<br /></div>
<div>
<ul>
<li>Source and manual: <a href="https://bitbucket.org/eunjeon/seunjeon">https://bitbucket.org/eunjeon/seunjeon</a></li>
</ul>
<div>
<h2 id="markdown-header-maven" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
Maven</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, 'Liberation Mono', Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="nt" style="color: navy;"><dependencies></span>
<span class="nt" style="color: navy;"><dependency></span>
<span class="nt" style="color: navy;"><groupId></span>org.bitbucket.eunjeon<span class="nt" style="color: navy;"></groupId></span>
<span class="nt" style="color: navy;"><artifactId></span>seunjeon_2.11<span class="nt" style="color: navy;"></artifactId></span>
<span class="nt" style="color: navy;"><version></span>1.0.4<span class="nt" style="color: navy;"></version></span>
<span class="nt" style="color: navy;"></dependency></span>
<span class="nt" style="color: navy;"></dependencies></span>
</pre>
</div>
<h2 id="markdown-header-sbt" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 16px; line-height: 1.5625; margin: 30px 0px 0px; padding: 0px;">
SBT</h2>
<div class="codehilite" style="background-color: white; color: #333333; font-family: Arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: Consolas, Menlo, 'Liberation Mono', Courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;"><span class="n">libraryDependencies</span> <span class="o" style="font-weight: 700;">+=</span> <span class="s" style="color: #bb8844;">"org.bitbucket.eunjeon"</span> <span class="o" style="font-weight: 700;">%%</span> <span class="s" style="color: #bb8844;">"seunjeon"</span> <span class="o" style="font-weight: 700;">%</span> <span class="s" style="color: #bb8844;">"1.0.4"</span>
</pre>
</div>
<div>
<span style="color: #333333; font-family: Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span></div>
</div>
</div>
<div>
<span style="color: #333333; font-family: Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span></div>
<div>
Changes</div>
<div>
<ul>
<li>Fixed an offset bug during deinflection</li>
</ul>
</div>
<div>
<br /></div>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0tag:blogger.com,1999:blog-901903469323947302.post-23010800694352081932016-02-08T22:50:00.001+09:002016-02-20T19:13:27.643+09:00elasticsearch-analysis-seunjeon 2.1.1.2 released. This is a new release that applies the <a href="http://eunjeon.blogspot.kr/2016/02/seunjeon-103.html">seunjeon 1.0.3 patch</a>.<br />
<br />
This patch fixes a bug that can occur commonly. In some cases morphological analysis did not work at all, so everyone should upgrade.<br />
<br />
<h4>
Source and manual</h4>
<a href="https://bitbucket.org/eunjeon/seunjeon/src/c67395dfd6f272b30459b203f1591baf8e504c72/elasticsearch/?at=es-2.1.1.2">https://bitbucket.org/eunjeon/seunjeon/src/.../elasticsearch/?at=es-2.1.1.2</a><br />
<br />
<h4>
Installation</h4>
<div>
</div>
<br />
<div class="codehilite" style="background-color: white; color: #333333; font-family: arial, sans-serif; font-size: 14px; line-height: 20px; margin: 10px 0px 0px; padding: 0px;">
<pre style="background: rgb(245, 245, 245); border-radius: 3px; border: 1px solid rgb(204, 204, 204); font-family: consolas, menlo, 'liberation mono', courier, monospace; font-size: 12px; line-height: 1.4; overflow-x: auto; padding: 5px 10px; word-wrap: normal;">./bin/plugin install org.bitbucket.eunjeon/elasticsearch-analysis-seunjeon/2.1.1.2
</pre>
<div>
<br /></div>
</div>
<h4>
Changes</h4>
<ul>
<li>Fixed sentences containing certain keywords not being fully analyzed</li>
</ul>
유영호http://www.blogger.com/profile/16742674488825377014noreply@blogger.com0