Analysis Code Roundup: Cumulative Frequency of Deictic Words - S-Linguistics


Posted by: Sho on 05/07/2024 26/06/2024 BLOG Japanese

I am collecting my analysis code in one place. This installment covers the cumulative frequency of deictic words.

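Continuing from the previous post, I am writing up more of the code.

This time the analysis focuses on the cumulative frequency of deictic words (demonstratives) in a text: for each deictic word, how many times it has occurred up to each token position.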

import matplotlib.pyplot as plt
import nltk
import numpy as np

def plot_cumulative_deictic_word_frequency(text, deictic_words):
    """
    Plot the cumulative frequency of each deictic word across the text.
    """
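    # Tokenize the text (needs NLTK's Punkt tokenizer data to be downloaded).
    tokens = nltk.word_tokenize(text)
    # One 0/1 indicator array per deictic word, aligned with token positions.
    occurrences = {word: np.zeros(len(tokens)) for word in deictic_words}

    for i, token in enumerate(tokens):
        for word in deictic_words:
            # Exact, case-sensitive match against the token.
            if token == word:
                occurrences[word][i] = 1

    plt.figure(figsize=(12, 6))
    for word, counts in occurrences.items():
        # The running sum of the indicators is the cumulative frequency.
        plt.plot(np.cumsum(counts), label=word)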

    plt.xlabel('Word Index')
    plt.ylabel('Cumulative Frequency')
    plt.title('Cumulative Frequency of Deictic Words')
    plt.legend()
    plt.show()

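# Plot the cumulative frequency of the deictic words.
# processed_text and deictic_words are assumed to be defined earlier (e.g. in the previous post).
plot_cumulative_deictic_word_frequency(processed_text, deictic_words)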
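To check the counting logic without plotting anything, here is a minimal sketch of the same running count using only the standard library. The helper name `cumulative_counts` and the sample sentence are made up for illustration, and the sketch assumes the text has already been lowercased and can be split on whitespace, which is simpler than what `nltk.word_tokenize` does.

```python
from itertools import accumulate


def cumulative_counts(tokens, targets):
    """Return {target: running count of that target at each token position}."""
    return {
        # int(token == target) is the 0/1 indicator; accumulate() sums it up.
        target: list(accumulate(int(token == target) for token in tokens))
        for target in targets
    }


# Hypothetical sample input, already lowercased and split on whitespace.
sample_tokens = "this is here and that is there , so this stays here".split()
result = cumulative_counts(sample_tokens, ["this", "that", "here"])

print(result["this"])  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2]
```

Each list has one entry per token, so it can be passed straight to `plt.plot` in place of the NumPy arrays used in the function above.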
