Sentence sequence is incorrect after tokenization

Posted on 2025-01-31 05:39:17

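I'm using the Women's E-Commerce Clothing Review dataset (from here) and applying the idea of combining the features into a single text field (from here) for NLP learning.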

from tensorflow.keras.preprocessing.text import Tokenizer  # import not shown in the original post; assumed

tokenizer = Tokenizer(num_words=60000)
tokenizer.fit_on_texts(text_train)  # text_train: training set containing the text feature only
sequences_train = tokenizer.texts_to_sequences(text_train)
sequences_test = tokenizer.texts_to_sequences(text_test)
word_index = tokenizer.word_index
print("\nWord Index = ", word_index)

Word Index =  {'i': 1, 'the': 2, 'and': 3, 'this': 4, 'is': 5, 'item': 6, 'a': 7, 'it': 8, 'of': 9, '5': 10, 'am': 11, 'out': 12, 'to': 13, 'from': 14, 'under': 15, 'years': 16, 'comes': 17, 'old': 18, 'stars': 19, 'rate': 20, 'department': 21, 'division': 22, 'classified': 23, 'in': 24, 'general': 25, 'but': 26, 'on': 27, 'for': 28, 'with': 29, 'so': 30, 'dresses': 31, 'was': 32, 'dress': 33, 'my': 34, 'not': 35, 'love': 36, 'tops': 37, 'petite': 38, 'that': 39, 'size': 40, 'very': 41, 'top': 42, 'have': 43, 'fit': 44, 'great': 45, 'like': 46, 'are': 47, 'be': 48, 'me': 49, 'as': 50, 'too': 51, 'wear': 52, "it's": 53, '4': 54, 'or': 55, 'just': 56, "i'm": 57, 'you': 58, 'fabric': 59, 'small': 60, 'would': 61, 'up': 62, 'they': 63, 'color': 64, 'at': 65, 'cute': 66, 'perfect': 67, 'knits': 68, 'beautiful': 69, 'look': 70, 'really': 71, 'if': 72, 'flattering': 73, 'more': 74, 'little': 75, 'these': 76, 'ordered': 77, 'bottoms': 78, 'soft': 79, 'jeans': 80, 'one': 81, 'comfortable': 82, 'will': 83, 'nice': 84, 'pants': 85, 'well': 86, 'an': 87, 'back': 88, '\r': 89, '3': 90, 'because': 91, 'can': 92, 'had': 93, 'shirt': 94, 'than': 95, 'large': 96, 'all': 97, 'blouses': 98, 'bought': 99, 'looks': 100, 'bit': 101, 'fits': 102, 'sweater': 103, 'down': 104, 'when': 105, 'pretty': 106, '2': 107, 'much': 108, 'material': 109, 'which': 110, 'them': 111, 'length': 112, 'long': 113, 'also': 114, 'has': 115, 'quality': 116, 'colors': 117, 'waist': 118, 'got': 119, 'skirt': 120, 'xs': 121, 'work': 122, 'medium': 123, 'even': 124, 'think': 125, 'good': 126, 'retailer': 127, 'tried': 128, 'runs': 129, 'short': 130, 'big': 131, 'summer': 132, 'store': 133, 'super': 134, 'made': 135, 'other': 136, 'about': 137, 'usually': 138, 'way': 139, 'cut': 140, 'get': 141, 'could': 142, 'only': 143, 'black': 144, 'style': 145, 'see': 146, "don't": 147, "didn't": 148, 'right': 149, 'there': 150, 'were': 151, 'jackets': 152, 'still': 153, 'design': 154, 'no': 155, 'true': 156, 'fine': 157, 
'did': 158, 'sleeves': 159, 'online': 160, 'white': 161, 'do': 162, 'intimate': 163, 'go': 164, 'lovely': 165, 'sweaters': 166, 'wearing': 167, 'off': 168, 'gorgeous': 169, 'purchased': 170, 'tight': 171, 'perfectly': 172, 'however': 173, 'does': 174, 'enough': 175, 'feel': 176, 'some': 177, 'fall': 178, 'better': 179, 'front': 180, 'person': 181, 'over': 182, 'initmates': 183, 'model': 184, 'definitely': 185, 'what': 186, '39': 187, 'looked': 188, 'been': 189, 'blue': 190, 'jacket': 191, 'though': 192, 'comfy': 193, 'price': 194, 'sale': 195, 'loved': 196, 'lbs': 197, 'how': 198, 'body': 199, 'casual': 200, 'your': 201, 'loose': 202, 'piece': 203, 'wanted': 204, 'high': 205, 'bottom': 206, 'light': 207, 'first': 208, 'try': 209, 'going': 210, 'blouse': 211, 'around': 212, 'skirts': 213, 'shape': 214, 'looking': 215, 'time': 216, 'regular': 217, 'worn': 218, 'many': 219, 'thin': 220, '6': 221, 'chest': 222, 'both': 223, 'make': 224, 'thought': 225, 'through': 226, 'arms': 227, '1': 228, 'pattern': 229, 'fun': 230, 's': 231, 'shoulders': 232, 'saw': 233, 'print': 234, 'makes': 235, 'recommend': 236, 'after': 237, 'gauge': 238, 'bust': 239, 'unique': 240, 'compliments': 241, "doesn't": 242, 'need': 243, 'without': 244, 'want': 245, 'being': 246, 'weight': 247, 'pair': 248, 'wish': 249, 'buy': 250, 'sure': 251, 'wore': 252, 'hips': 253, 'find': 254, "i've": 255, 'side': 256, 'run': 257, 'found': 258, 'went': 259, 'reviews': 260, 'bra': 261, "can't": 262, 'different': 263, '34': 264, 'easy': 265, 'order': 266, 'usual': 267, 'shorts': 268, 'picture': 269, '35': 270, 'since': 271, 'lot': 272, 'by': 273, '36': 274, '8': 275, 'tee': 276, '38': 277, 'longer': 278, 'received': 279, 'detail': 280, 'quite': 281, 'versatile': 282, 'day': 283, 'while': 284, 'its': 285, 'leggings': 286, 'most': 287, 'any': 288, 'absolutely': 289, 'boxy': 290, 'nicely': 291, 'normally': 292, 'm': 293, 'two': 294, 'adorable': 295, 'overall': 296, 'another': 297, 'slightly': 298, 'then': 299, 
'tank': 300, 'keep': 301, 'green': 302, 'may': 303, '26': 304, 'wide': 305, 'fitted': 306, 'lace': 307, 'return': 308, 'sizing': 309, 'red': 310, 'felt': 311, 'stretch': 312, '33': 313, 'underneath': 314, 'neck': 315, 'might': 316, '37': 317, 'warm': 318, 'say': 319, 't': 320, 'now': 321, 'happy': 322, 'almost': 323, 'flowy': 324, "wasn't": 325, 'sheer': 326, 'lounge': 327, 'favorite': 328, '10': 329, 'low': 330, 'something': 331, '46': 332, 'photo': 333, 'sizes': 334, 'actually': 335, 'probably': 336, 'extra': 337, '40': 338, '41': 339, 'amazing': 340, 'huge': 341, 'pockets': 342, '32': 343, 'worth': 344, 'every': 345, 'spring': 346, 'purchase': 347, 'area': 348, 'yet': 349, 'such': 350, '29': 351, 'disappointed': 352, '28': 353, 'feminine': 354, 'dressed': 355, 'full': 356, 'tall': 357, "5'4": 358, 'best': 359, 'feels': 360, 'smaller': 361, 'into': 362, 'boots': 363, 'put': 364, 'shorter': 365, '42': 366, 'details': 367, 'said': 368, 'same': 369, '48': 370, 'unfortunately': 371, 'denim': 372, 'everything': 373, 'buttons': 374, 'between': 375, 'winter': 376, '30': 377, '27': 378, 'tunic': 379, 'tts': 380, 'glad': 381, '44': 382, 'navy': 383, 'sized': 384, 'reference': 385, 'neckline': 386, 'skinny': 387, 'always': 388, 'decided': 389, 'pictured': 390, 'give': 391, 'line': 392, '31': 393, 'liked': 394, 'larger': 395, 'maybe': 396, '43': 397, 'should': 398, 'coat': 399, 'thick': 400, 'above': 401, '53': 402, 'lightweight': 403, 'easily': 404, 'wait': 405, 'snug': 406, 'wash': 407, 'part': 408, 'figure': 409, 'fitting': 410, 'cozy': 411, "i'd": 412, 'new': 413, 'take': 414, 'pink': 415, 'problem': 416, 'where': 417, 'especially': 418, 'button': 419, 'returned': 420, 'thing': 421, 'came': 422, 'expected': 423, 'grey': 424, "isn't": 425, '47': 426, 'room': 427, 'seems': 428, 'reviewers': 429, 'classic': 430, 'heavy': 431, '45': 432, 'cardigan': 433, 'cotton': 434, '25': 435, 'slip': 436, 'hem': 437, "5'3": 438, 'l': 439, 'kind': 440, 'normal': 441, 'few': 442, 'know': 
443, 'beautifully': 444, '49': 445, "couldn't": 446, 'knit': 447, 'stylish': 448, 'cool': 449, 'curvy': 450, 'lining': 451, 'knee': 452, 'dark': 453, 'embroidery': 454, 'xl': 455, '0': 456, "that's": 457, '56': 458, 'belt': 459, 'never': 460, 'took': 461, 'cami': 462, '12': 463, 'frame': 464, 'before': 465, 'although': 466, 'bad': 467, 'hit': 468, 'lined': 469, "i'll": 470, 'torso': 471, 'typically': 472, 'goes': 473, 'extremely': 474, 'washed': 475, 'able': 476, 'arm': 477, "5'5": 478, 'excited': 479, 'ever': 480, "5'2": 481, 'works': 482, 'who': 483, 'vest': 484, 'legs': 485, 'reviewer': 486, 'hits': 487, 'stunning': 488, '50': 489, 'returning': 490, 'someone': 491, 'once': 492, 'highly': 493, 'show': 494, 'anything': 495, 'xxs': 496, 'those': 497, 'weather': 498, 'simple': 499, 'either': 500, 'type': 501, 'wardrobe': 502, '52': 503, 'suit': 504, 'unflattering': 505, 'shoulder': 506, 'zipper': 507, 'photos': 508, 'issue': 509, 'ended': 510, "5'7": 511, 'year': 512, 'today': 513, 'things': 514, 'needed': 515, 'stretchy': 516, 'exactly': 517, 'gray': 518, 'staple': 519, 'sleeve': 520, 'sold': 521, 'why': 522, '51': 523, 'tie': 524, 'swing': 525, 'layer': 526, "you're": 527, 'hot': 528, 'shown': 529, 'trying': 530, 'already': 531, 'fell': 532, 'below': 533, 'days': 534, 'arrived': 535, 'wonderful': 536, 'swim': 537, 'last': 538, 'layering': 539, 'brand': 540, 'must': 541, 'orange': 542, '60': 543, 'agree': 544, 'itchy': 545, "5'6": 546, 'trend': 547, 'add': 548, 'slim': 549, 'nothing': 550, 'basic': 551, 'skin': 552, 'otherwise': 553, 'away': 554, 'tad': 555, "wouldn't": 556, 'straps': 557, 'cheap': 558, 'come': 559, 'holes': 560, 'getting': 561, 'again': 562, 'sadly': 563, 'baggy': 564, 'due': 565, '54': 566, 'odd': 567, '57': 568, 'local': 569, 'hoping': 570, 'dry': 571, 'delicate': 572, "they're": 573, 'sides': 574, 'hard': 575, 'across': 576, 'vibrant': 577, 'ran': 578, "5'8": 579, '62': 580, 'inches': 581, 'outerwear': 582, 'weird': 583, 'others': 584, 
'pilcro': 585, 'cropped': 586, 'plus': 587, 'several': 588, 'rather': 589, 'waisted': 590, 'elegant': 591, 'bright': 592, 'awesome': 593, 'surprised': 594, 'busty': 595, 'ordering': 596, 'special': 597, 'heels': 598, 'less': 599, 'seem': 600, 'keeping': 601, 'drape': 602, 'straight': 603, "5'": 604, 'roomy': 605, 'open': 606, 'version': 607, 'hangs': 608, 'least': 609, 'use': 610, 'yellow': 611, 'kept': 612, 'touch': 613, '55': 614, 'night': 615, '66': 616, 'pictures': 617, 'pull': 618, 'plan': 619, 'based': 620, 'pounds': 621, 'ok': 622, 'own': 623, 'oversized': 624, 'drapes': 625, 'seemed': 626, 'v': 627, 'sometimes': 628, 'season': 629, 'ankle': 630, 'paired': 631, 'review': 632, 'tights': 633, 'shirts': 634, 'far': 635, '24': 636, 'elastic': 637, 'jumpsuit': 638, '59': 639, 'lots': 640, 'their': 641, 'bigger': 642, 'worked': 643, 'myself': 644, 'totally': 645, 'flare': 646, 'available': 647, 'having': 648, 'flat': 649, 'wedding': 650, 'jean': 651, 'here': 652, 'gives': 653, 'she': 654, 'booties': 655, '58': 656, 'sexy': 657, 'making': 658, 'immediately': 659, 'poor': 660, 'mentioned': 661, "5'1": 662, 'mine': 663, 'cover': 664, 'completely': 665, 'falls': 666, '115': 667, 'collar': 668, 'times': 669, 'interesting': 670, 'wool': 671, 'end': 672, 'sleep': 673, 'instead': 674, 'stiff': 675, 'tiny': 676, 'higher': 677, 'leg': 678, 'ivory': 679, 'cold': 680, "won't": 681, 'finally': 682, 'classy': 683, 'neutral': 684, 'pant': 685, 'yes': 686, 'mid': 687, 'previous': 688, "there's": 689, 'dressy': 690, 'throw': 691, 'shows': 692, 'athletic': 693, 'slimming': 694, 'wrong': 695, 'lower': 696, 'people': 697, 'read': 698, 'buying': 699, 'thinking': 700, 'substantial': 701, 'adds': 702, 'sandals': 703, 'texture': 704, '23': 705, 'cream': 706, 'used': 707, 'form': 708, 'knees': 709, 'incredibly': 710, 'chic': 711, "5'9": 712, 'eye': 713, 'fact': 714, 'clothes': 715, 'months': 716, 'similar': 717, '64': 718, 'fabulous': 719, 'curves': 720, 'addition': 721, 'half': 722, 
'close': 723, 'knew': 724, 'armholes': 725, 'closet': 726, 'product': 727, 'bulky': 728, 'else': 729, 'idea': 730, 'outfit': 731, 'maternity': 732, 'stripes': 733, 'inside': 734, 'hourglass': 735, 'broad': 736, 'linen': 737, 'prefer': 738, 'worried': 739, 'gotten': 740, 'detailing': 741, 'rich': 742, 'often': 743, 'hip': 744, 'home': 745, 'clothing': 746, 'brown': 747, '65': 748, 'reason': 749, 'flats': 750, 'sweatshirt': 751, '120': 752, 'subtle': 753, 'second': 754, 'silk': 755, 'peplum': 756, 'baby': 757, 'rise': 758, 'amount': 759, '20': 760, 'depending': 761, 'expect': 762, 'deep': 763, 'next': 764, 'during': 765, 'appropriate': 766, '61': 767, 'comfort': 768, 'tent': 769, 'gave': 770, 'c': 771, 'fantastic': 772, 'except': 773, 'until': 774, 'floral': 775, '63': 776, 'mind': 777, 'mail': 778, 'strange': 779, 'b': 780, 'appears': 781, 'flow': 782, 'couple': 783, 'sad': 784, 'blazer': 785, 'romper': 786, 'hope': 787, 'w': 788, 'purple': 789, 'soon': 790, 'thicker': 791, 'left': 792, 'tell': 793, 'upper': 794, 'washing': 795, 'hang': 796, 'awkward': 797, 'fan': 798, 'point': 799, '34c': 800, 'complaint': 801, 'together': 802, 'issues': 803, 'pleased': 804, 'colored': 805, 'maeve': 806, 'disappointing': 807, 'along': 808, 'thighs': 809, 'anyway': 810, 'hold': 811, '34b': 812, 'perhaps': 813, 'nude': 814, 'taller': 815, 'hand': 816, 'her': 817, '125': 818, 'looser': 819, 'build': 820, 'live': 821, 'deal': 822, 'gone': 823, 'added': 824, '135': 825, 'forward': 826, 'justice': 827, '140': 828, 'exchange': 829, 'slight': 830, 'somewhat': 831, 'maxi': 832, 'everywhere': 833, 'real': 834, 'tag': 835, 'needs': 836, 'cannot': 837, 'fairly': 838, 'please': 839, 'inch': 840, 'case': 841, 'wow': 842, 'hung': 843, 'sweet': 844, 'truly': 845, 'beach': 846, 'feeling': 847, 'place': 848, 'girls': 849, 'sort': 850, 'middle': 851, "haven't": 852, 'bodice': 853, 'seam': 854, 'motif': 855, 'spot': 856, 'flows': 857, 'past': 858, 'pulled': 859, 'seams': 860, 'whole': 861, 'coverage': 
862, 'butt': 863, 'leather': 864, 'meant': 865, 'expecting': 866, 'excellent': 867, 'simply': 868, 'women': 869, 'party': 870, 'office': 871, 'pieces': 872, 'transition': 873, 'narrow': 874, 'scratchy': 875, 'opinion': 876, 'we': 877, 'pairs': 878, 'airy': 879, 'items': 880, 'heavier': 881, 'tummy': 882, 'waistband': 883, 'says': 884, 'plaid': 885, 'itself': 886, 'zip': 887, 'weekend': 888, 'band': 889, 'typical': 890, 'done': 891, 'girl': 892, 'generally': 893, 'styling': 894, 'stitching': 895, 'slender': 896, 'barely': 897, 'sewn': 898, 'relaxed': 899, 'imagine': 900, '130': 901, 'three': 902, 'reading': 903, '100': 904, 'legwear': 905, 'seen': 906, 'stay': 907, 'unless': 908, 'noticed': 909, 'darker': 910, 'dinner': 911, 'difficult': 912, 'bag': 913, 'chested': 914, 'each': 915, 'belly': 916, '14': 917, 'lay': 918, 'everyday': 919, 'ago': 920, 'gold': 921, 'oh': 922, 'showing': 923, 'clean': 924, 'ladies': 925, 'husband': 926, 'swingy': 927, 'everyone': 928, 'pregnant': 929, 'lines': 930, 'portion': 931, 'sack': 932, 'clingy': 933, 'wider': 934, 'anyone': 935, 'shapeless': 936, 'ag': 937, '67': 938, 'believe': 939, 'prettier': 940, 'surprise': 941, '34d': 942, 'frumpy': 943, 'places': 944, 'seasons': 945, 'slits': 946, 'necklace': 947, 'running': 948, "5'10": 949, 'care': 950, 'tailored': 951, 'okay': 952, 'chance': 953, 'cuter': 954, 'lighter': 955, 'perfection': 956, 'keeper': 957, 'shade': 958, 'shaped': 959, 'petites': 960, 'rest': 961, 'coral': 962, 'beauty': 963, 'finding': 964, 'caught': 965, 'money': 966, 'intimates': 967, 'tucked': 968, 'help': 969, 'thigh': 970, 'romantic': 971, 'hanging': 972, 'cup': 973, 'pleats': 974, 'stomach': 975, 'height': 976, 'silhouette': 977, "aren't": 978, 'solid': 979, 'polyester': 980, 'expensive': 981, 'dressing': 982, 'guess': 983, 'p': 984, 'silky': 985, 'plenty': 986, 'd': 987, 'flowing': 988, 'description': 989, 'patterns': 990, 'purchasing': 991, 'layers': 992, 'structured': 993, 'life': 994, 'poncho': 995, 'pear': 
996, 'beige': 997, 'uncomfortable': 998, 'vintage': 999, 'camisole': 1000, ...,'bored': 13489} # 1001-13488 has been removed due to word limit

print("The encoding for document\n",text_train[1356],"\n is : ",sequences_train[1356])

The encoding for document
 This item comes from the Bottoms department and General division, and is classified under Pants. I am 36 years old. I rate this item 5 out of 5 stars. Lovely. These trousers are wonderful. the fabric is comfortable and does have a little give to it. 
 is :  [4, 6, 17, 14, 2, 78, 21, 3, 25, 22, 3, 5, 23, 15, 85, 1, 11, 445, 16, 18, 1, 20, 4, 6, 54, 12, 9, 10, 19, 36, 76, 1, 93, 189, 1410, 76, 28, 7, 284, 26, 50, 63, 148, 600, 41, 3478, 1, 1314, 412, 405, 28, 7, 195, 1, 682, 99, 111, 65, 2, 338, 168, 9, 195, 194, 2135, 3, 1, 36, 111, 573, 474, 82, 3, 41, 73, 412, 319, 63, 257, 156, 13, 40, 26, 396, 27, 2, 361, 672, 9, 2, 40, 28, 2, 118, 2, 190, 85, 47, 74, 2528, 24, 64, 95, 2, 269, 692, 26, 153, 69, 57, 215, 826, 13, 167, 76, 321, 3, 362, 2, 346]

But when I check this particular sentence, the encoding is not right. What am I doing wrong? Thanks a lot.
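One way to see the mismatch is to map the IDs back to words. The sketch below is plain Python and uses only an excerpt of the word index and the first 29 IDs of the encoding printed above. The names `word_index_excerpt`, `encoded_prefix` and `decode` are made up for illustration. With the real tokenizer, `tokenizer.sequences_to_texts([sequences_train[1356]])` should give the same kind of reverse lookup.

```python
# Excerpt of the word index printed above (only the entries needed here).
word_index_excerpt = {
    'i': 1, 'the': 2, 'and': 3, 'this': 4, 'is': 5, 'item': 6, 'of': 9, '5': 10,
    'am': 11, 'out': 12, 'from': 14, 'under': 15, 'years': 16, 'comes': 17,
    'old': 18, 'stars': 19, 'rate': 20, 'department': 21, 'division': 22,
    'classified': 23, 'general': 25, '4': 54, 'bottoms': 78, 'pants': 85,
    '36': 274, '49': 445,
}

# Reverse mapping: integer ID -> word.
index_word = {idx: word for word, idx in word_index_excerpt.items()}

# First 29 IDs of sequences_train[1356], copied from the output above.
encoded_prefix = [4, 6, 17, 14, 2, 78, 21, 3, 25, 22, 3, 5, 23, 15, 85,
                  1, 11, 445, 16, 18, 1, 20, 4, 6, 54, 12, 9, 10, 19]


def decode(ids, index_word):
    """Turn a list of token IDs back into a space-separated string."""
    return " ".join(index_word.get(i, "<unk>") for i in ids)


decoded = decode(encoded_prefix, index_word)
print(decoded)
```

The decoded prefix mentions age 49 and a rating of 4, while the printed document says 36 and 5, so the sequence seems to belong to a different review rather than being a wrong encoding of this one. One possible cause, assuming `text_train` is a pandas Series that kept its original index after a shuffled split (the post does not show how it was built): `text_train[1356]` selects by label, whereas `sequences_train[1356]` selects by position in a plain list. In that case `text_train.iloc[1356]` would be the positional counterpart.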

