From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (2048 bits))
    (No client certificate requested)
    by finch.gentoo.org (Postfix) with ESMTPS id 926DB158087
    for ; Thu, 3 Feb 2022 21:23:47 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id C422FE07C9;
    Thu, 3 Feb 2022 21:23:45 +0000 (UTC)
Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits))
    (No client certificate requested)
    by pigeon.gentoo.org (Postfix) with ESMTPS id A8C17E07C9
    for ; Thu, 3 Feb 2022 21:23:45 +0000 (UTC)
Received: from oystercatcher.gentoo.org (unknown [IPv6:2a01:4f8:202:4333:225:90ff:fed9:fc84])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits))
    (No client certificate requested)
    by smtp.gentoo.org (Postfix) with ESMTPS id B65F03432F9
    for ; Thu, 3 Feb 2022 21:23:44 +0000 (UTC)
Received: from localhost.localdomain (localhost [IPv6:::1])
    by oystercatcher.gentoo.org (Postfix) with ESMTP id 68EA81BD
    for ; Thu, 3 Feb 2022 21:23:43 +0000 (UTC)
From: "Magnus Granberg"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Magnus Granberg"
Message-ID: <1643923408.57d8ef85943af7ddfeaa51d0cacdacba80a75a13.zorry@gentoo>
Subject: [gentoo-commits] proj/tinderbox-cluster:master commit in: buildbot_gentoo_ci/logs/
X-VCS-Repository: proj/tinderbox-cluster
X-VCS-Files: buildbot_gentoo_ci/logs/log_parser.py
X-VCS-Directories: buildbot_gentoo_ci/logs/
X-VCS-Committer: zorry
X-VCS-Committer-Name: Magnus Granberg
X-VCS-Revision: 57d8ef85943af7ddfeaa51d0cacdacba80a75a13
X-VCS-Branch: master
Date: Thu, 3 Feb 2022 21:23:43 +0000 (UTC)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
X-Archives-Salt: 49c07429-a4b8-43cb-a64f-477381fdc628
X-Archives-Hash: db417eb7da77ac1701432ec6c3b5140b

commit:     57d8ef85943af7ddfeaa51d0cacdacba80a75a13
Author:     Magnus Granberg gentoo org>
AuthorDate: Thu Feb 3 21:23:28 2022 +0000
Commit:     Magnus Granberg gentoo org>
CommitDate: Thu Feb 3 21:23:28 2022 +0000
URL:        https://gitweb.gentoo.org/proj/tinderbox-cluster.git/commit/?id=57d8ef85

Fix UnicodeDecodeError in log parser

Signed-off-by: Magnus Granberg gentoo.org>

 buildbot_gentoo_ci/logs/log_parser.py | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/buildbot_gentoo_ci/logs/log_parser.py b/buildbot_gentoo_ci/logs/log_parser.py
index 7531532..448192c 100644
--- a/buildbot_gentoo_ci/logs/log_parser.py
+++ b/buildbot_gentoo_ci/logs/log_parser.py
@@ -166,13 +166,10 @@ def runLogParser(args):
     # Is stored in a db instead of files.
     log_search_pattern_list = get_log_search_pattern(Session, args.uuid, config['default_uuid'])
     Session.close()
-    #FIXME: UnicodeDecodeError: 'utf-8' codec can't decode byte ... in some logs
-    with io.TextIOWrapper(io.BufferedReader(gzip.open(args.file, 'rb'))) as f:
-        for text_line in f:
-            logfile_text_dict[index] = text_line.strip('\n')
-            index = index + 1
-            max_text_lines = index
-        f.close()
+    for text_line in io.TextIOWrapper(io.BufferedReader(gzip.open(args.file)), encoding='utf8', errors='ignore'):
+        logfile_text_dict[index] = text_line.strip('\n')
+        index = index + 1
+        max_text_lines = index
     # run the parse patten on the line
     for tmp_index, text in logfile_text_dict.items():
         res = mp_pool.apply_async(search_buildlog, (log_search_pattern_list, logfile_text_dict, tmp_index, max_text_lines,))
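
A minimal, standalone sketch of what the change above relies on: decoding a
gzip-compressed log with errors='ignore' so that a stray non-UTF-8 byte is
dropped instead of raising UnicodeDecodeError. This is not code from the
repository. The names read_gzip_log and demo, the sample log contents, and
numbering lines from 1 are assumptions made for illustration only.

```python
import gzip
import io
import os
import tempfile


def read_gzip_log(path, errors="ignore"):
    """Return {line_number: text} for a gzip-compressed log file.

    Hypothetical helper, not part of log_parser.py. Line numbers start
    at 1 here, which is an assumption about the original index.
    """
    lines = {}
    # gzip.open() defaults to binary mode; TextIOWrapper does the decoding
    # and applies the chosen error handler to undecodable bytes.
    with io.TextIOWrapper(gzip.open(path), encoding="utf8", errors=errors) as f:
        for index, text_line in enumerate(f, start=1):
            lines[index] = text_line.strip("\n")
    return lines


def demo():
    """Write a small log with one invalid byte and read it back twice."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "build.log.gz")
        with gzip.open(path, "wb") as f:
            # 0xff can never appear in valid UTF-8.
            f.write(b"first line\nbad \xff byte\nlast line\n")
        ignored = read_gzip_log(path, errors="ignore")
        replaced = read_gzip_log(path, errors="replace")
    return ignored, replaced


if __name__ == "__main__":
    ignored, replaced = demo()
    print(ignored)
    print(replaced)
```

With errors='ignore' the offending byte disappears silently from the line.
errors='replace' would instead put U+FFFD in its place, which keeps a visible
marker of where the log was damaged. Which is preferable for pattern matching
on build logs is a judgment call that the commit does not discuss.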