
Artificial Intelligence Academic Digest [8.24]

By: 公众号-arXiv每日学术速递 (WeChat official account "arXiv Daily Academic Digest")
Published: 2021-08-25 16:11:46


cs.AI (Artificial Intelligence): 70 papers in total

【1】 A Simplicial Model for KB4_n: Epistemic Logic with Agents that May Die
Link: https://arxiv.org/abs/2108.10293

Authors: Eric Goubault, Jérémy Ledent, Sergio Rajsbaum
Affiliations: MSP Group, University of Strathclyde, Glasgow, Scotland; UNAM, Mexico D.F., Mexico
Abstract: The standard semantics of multi-agent epistemic logic $S5$ is based on Kripke models whose accessibility relations are reflexive, symmetric and transitive. This one-dimensional structure contains implicit higher-dimensional information beyond pairwise interactions, which has been formalized as pure simplicial models in previous work by the authors. Here we extend the theory to encompass all simplicial models, including the ones that are not pure. The corresponding Kripke models are those where the accessibility relation is symmetric and transitive, but might not be reflexive. This yields the epistemic logic $KB4_n$, which can reason about situations where some of the agents may die.

【2】 ChiNet: Deep Recurrent Convolutional Learning for Multimodal Spacecraft Pose Estimation
Link: https://arxiv.org/abs/2108.10282

Authors: Duarte Rondao, Nabil Aouf, Mark A. Richardson
Affiliations: N. Aouf is a Professor of Robotics and Autonomous Systems with the Department of Electrical and Electronic Engineering at City, University of London
Note: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Abstract: This paper presents an innovative deep learning pipeline which estimates the relative pose of a spacecraft by incorporating the temporal information from a rendezvous sequence. It leverages the performance of long short-term memory (LSTM) units in modelling sequences of data for the processing of features extracted by a convolutional neural network (CNN) backbone. Three distinct training strategies, which follow a coarse-to-fine funnelled approach, are combined to facilitate feature learning and improve end-to-end pose estimation by regression. The capability of CNNs to autonomously ascertain feature representations from images is exploited to fuse thermal infrared data with red-green-blue (RGB) inputs, thus mitigating the effects of artefacts from imaging space objects in the visible wavelength. Each contribution of the proposed framework, dubbed ChiNet, is demonstrated on a synthetic dataset, and the complete pipeline is validated on experimental data.

【3】 Automatic Speech Recognition using limited vocabulary: A survey
Link: https://arxiv.org/abs/2108.10254

Authors: Jean Louis K. E. Fendji, Diane M. Tala, Blaise O. Yenke, Marcellin Atemkeng
Affiliations: Department of Mathematics and Computer Science, University of Ngaoundere, Ngaoundere, Cameroon
Note: 20 pages, 9 figures, 6 tables, submitted to IEEE ACCESS for possible publication
Abstract: Automatic Speech Recognition (ASR) is an active field of research due to its huge number of applications and the proliferation of interfaces and computing devices that can support speech processing. But the bulk of applications is based on well-resourced languages that overshadow under-resourced ones. Yet ASR represents an undeniable means to promote such languages, especially when designing human-to-human or human-to-machine systems involving illiterate people. An approach to designing an ASR system targeting under-resourced languages is to start with a limited vocabulary. ASR using a limited vocabulary is a subset of the speech recognition problem that focuses on the recognition of a small number of words or sentences. This paper aims to provide a comprehensive view of mechanisms behind ASR systems as well as techniques, tools, projects, recent contributions, and possible future directions in ASR using a limited vocabulary. This work consequently provides a way forward when designing an ASR system using a limited vocabulary. Although an emphasis is put on limited vocabulary, most of the tools and techniques reported in this survey apply to ASR systems in general.

【4】 Federated Multi-Task Learning under a Mixture of Distributions
Link: https://arxiv.org/abs/2108.10252

Authors: Othmane Marfoq, Giovanni Neglia, Aurélien Bellet, Laetitia Kameni, Richard Vidal
Affiliations: Inria, Université Côte d'Azur, Sophia Antipolis, France; Inria, Lille, France; Accenture Labs, Sophia Antipolis, France
Note: 73 pages
Abstract: The increasing size of data generated by smartphones and IoT devices motivated the development of Federated Learning (FL), a framework for on-device collaborative training of machine learning models. First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client, due to the inherent heterogeneity of local data distributions. Federated multi-task learning (MTL) approaches can learn personalized models by formulating an opportune penalized optimization problem. The penalization term can capture complex relations among personalized models, but eschews clear statistical assumptions about local data distributions. In this work, we propose to study federated MTL under the flexible assumption that each local data distribution is a mixture of unknown underlying distributions. This assumption encompasses most of the existing personalized FL approaches and leads to federated EM-like algorithms for both client-server and fully decentralized settings. Moreover, it provides a principled way to serve personalized models to clients not seen at training time. The algorithms' convergence is analyzed through a novel federated surrogate optimization framework, which can be of general interest. Experimental results on FL benchmarks show that in most cases our approach provides models with higher accuracy and fairness than state-of-the-art methods.
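
A toy sketch of the EM-like round described above, under the stated assumption that each client's data is a mixture of a few shared underlying components. This is illustrative only, not the authors' algorithm: the component models are linear regressors, and the learning rate, Gaussian noise model, and uniform server averaging are assumptions made for brevity.

```python
# Federated EM sketch: global mixture components, local E/M steps, server averaging.
import numpy as np

M, d = 3, 5                                       # mixture components, feature dim
components = [np.zeros(d) for _ in range(M)]      # global component parameters

def client_update(X, y, components, sigma2=1.0, lr=0.1):
    """One local EM step: E-step computes per-sample responsibilities,
    M-step takes a responsibility-weighted gradient step per component."""
    preds = np.stack([X @ w for w in components])       # (M, n) predictions
    log_lik = -0.5 * (y - preds) ** 2 / sigma2          # Gaussian log-likelihoods
    resp = np.exp(log_lik - log_lik.max(0))
    resp /= resp.sum(0)                                 # (M, n) responsibilities
    grads = [(resp[m] * (X @ w - y)) @ X / len(y) for m, w in enumerate(components)]
    return [w - lr * g for w, g in zip(components, grads)]

# Server rounds: average client-updated components (uniform weights for brevity).
clients = [(np.random.randn(20, d), np.random.randn(20)) for _ in range(4)]
for _ in range(10):
    locals_ = [client_update(X, y, components) for X, y in clients]
    components = [np.mean([loc[m] for loc in locals_], axis=0) for m in range(M)]
# Each client's personalized model is its responsibility-weighted mix of components.
```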

【5】 Fusion of evidential CNN classifiers for image classification
Link: https://arxiv.org/abs/2108.10233

Authors: Zheng Tong, Philippe Xu, Thierry Denoeux
Affiliations: Université de technologie de Compiègne, CNRS, UMR Heudiasyc, Compiègne, France; Institut universitaire de France, Paris, France
Abstract: We propose an information-fusion approach based on belief functions to combine convolutional neural networks. In this approach, several pre-trained DS-based CNN architectures extract features from input images and convert them into mass functions on different frames of discernment. A fusion module then aggregates these mass functions using Dempster's rule. An end-to-end learning procedure allows us to fine-tune the overall architecture using a learning set with soft labels, which further improves the classification performance. The effectiveness of this approach is demonstrated experimentally using three benchmark databases.
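
Dempster's rule, which the fusion module applies to the per-branch mass functions, is compact enough to sketch directly. The toy example below combines two hand-made mass functions over a two-class frame of discernment; it illustrates the rule itself, not the paper's CNN architecture.

```python
# Dempster's rule of combination for two mass functions (basic belief assignments).
from itertools import product

def dempster(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dicts."""
    combined, conflict = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:
            combined[inter] = combined.get(inter, 0.0) + a * b
        else:
            conflict += a * b                  # mass falling on the empty set
    return {A: v / (1.0 - conflict) for A, v in combined.items()}  # normalize

# Two classifier branches expressing beliefs over the frame {cat, dog}:
m1 = {frozenset({"cat"}): 0.6, frozenset({"cat", "dog"}): 0.4}
m2 = {frozenset({"dog"}): 0.3, frozenset({"cat", "dog"}): 0.7}
print(dempster(m1, m2))
```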

【6】 Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs
Link: https://arxiv.org/abs/2108.10227

Authors: Timothy L. Molloy, Girish N. Nair
Affiliations: Department of Electrical and Electronic Engineering, University of Melbourne
Note: 41 pages, 2 figures, submitted (under review)
Abstract: We study the problem of controlling a partially observed Markov decision process (POMDP) to either aid or hinder the estimation of its state trajectory by optimising the conditional entropy of the state trajectory given measurements and controls, a quantity we dub the smoother entropy. Our consideration of the smoother entropy contrasts with previous active state estimation and obfuscation approaches that instead resort to measures of marginal (or instantaneous) state uncertainty due to tractability concerns. By establishing novel expressions of the smoother entropy in terms of the usual POMDP belief state, we show that our active estimation and obfuscation problems can be reformulated as Markov decision processes (MDPs) that are fully observed in the belief state. Surprisingly, we identify belief-state MDP reformulations of both active estimation and obfuscation with concave cost and cost-to-go functions, which enables the use of standard POMDP techniques to construct tractable bounded-error (approximate) solutions. We show in simulations that optimisation of the smoother entropy leads to superior trajectory estimation and obfuscation compared to alternative approaches.
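
The reformulation relies on the POMDP belief state being a sufficient statistic. A minimal sketch of the underlying belief (Bayes filter) update is given below, with toy transition and observation matrices standing in for a real model and a single fixed control.

```python
# POMDP belief-state update: predict through the transition model, then correct
# with the observation likelihood. Two states, two observations, toy numbers.
import numpy as np

T = np.array([[0.9, 0.1],      # T[s, s'] = P(s' | s, u) for one fixed control u
              [0.2, 0.8]])
O = np.array([[0.7, 0.3],      # O[s', z] = P(z | s')
              [0.4, 0.6]])

def belief_update(b, z):
    predicted = b @ T                    # prediction step: P(s' | history)
    unnormalized = predicted * O[:, z]   # correction with observation z
    return unnormalized / unnormalized.sum()

b = np.array([0.5, 0.5])
for z in [0, 0, 1]:                      # a short observation sequence
    b = belief_update(b, z)
    print(b)
```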

【7】 A New Constructive Heuristic driven by Machine Learning for the Traveling Salesman Problem
Link: https://arxiv.org/abs/2108.10224

Authors: Umberto Junior Mele, Luca Maria Gambardella, Roberto Montemanni
Affiliations: Department of Sciences and Methods for Engineering, University of Modena and Reggio Emilia, Reggio Emilia, Italy
Abstract: Recent systems applying Machine Learning (ML) to solve the Traveling Salesman Problem (TSP) exhibit issues when they try to scale up to real case scenarios with several hundred vertices. The use of Candidate Lists (CLs) has been brought up to cope with these issues. The procedure allows to restrict the search space during solution creation, consequently reducing the solver computational burden. So far, ML was engaged to create CLs and values on the edges of these CLs expressing ML preferences at solution insertion. Although promising, these systems do not clearly restrict what the ML learns and does to create solutions, bringing with them some generalization issues. Therefore, motivated by exploratory and statistical studies, in this work we instead use a machine learning model to confirm the addition to the solution just for highly probable edges. CLs of the highly probable edges are employed as input, and the ML is in charge of distinguishing cases where such edges are in the optimal solution from those where they are not. This strategy enables a better generalization and creates an efficient balance between machine learning and searching techniques. Our ML-Constructive heuristic is trained on small instances. Then, it is able to produce solutions, without losing quality, to large problems as well. We compare our results with classic constructive heuristics, showing good performances for TSPLIB instances up to 1748 cities. Although our heuristic exhibits an expensive constant time operation, we proved that the computational complexity in the worst-case scenario, for the solution construction after training, is $O(n^2 \log n^2)$, where $n$ is the number of vertices in the TSP instance.
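
A candidate list, as used above, simply restricts each vertex to a handful of promising edges. The sketch below builds k-nearest-neighbour CLs for a random Euclidean instance; the choice of k and the plain k-NN rule are illustrative assumptions, not the paper's exact CL construction.

```python
# Candidate-list construction for TSP: keep each vertex's k nearest neighbours,
# so a constructive heuristic (or an ML model scoring edges) only considers them.
import numpy as np

def candidate_lists(coords, k=5):
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # a vertex is not its own candidate
    return np.argsort(d, axis=1)[:, :k]      # indices of the k closest vertices

coords = np.random.rand(100, 2)              # 100 random cities in the unit square
cl = candidate_lists(coords)
print(cl.shape)                              # (100, 5): the restricted search space
```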

【8】 Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks
Link: https://arxiv.org/abs/2108.10217

Authors: Nima Mirnateghi, Syed Afaq Ali Shah, Mohammed Bennamoun
Affiliations: Murdoch University; Department of Computer Science and Software Engineering, University of Western Australia (M. Bennamoun)
Abstract: Deep learning has become an integral part of various computer vision systems in recent years due to its outstanding achievements for object recognition, facial recognition, and scene understanding. However, deep neural networks (DNNs) are susceptible to being fooled with high confidence by an adversary. In practice, the vulnerability of deep learning systems against carefully perturbed images, known as adversarial examples, poses a dire security threat in physical-world applications. To address this phenomenon, we present what is, to our knowledge, the first ever image set based adversarial defence approach. Image set classification has shown an exceptional performance for object and face recognition, owing to its intrinsic property of handling appearance variability. We propose a robust deep Bayesian image set classification as a defence framework against a broad range of adversarial attacks. We extensively evaluate the performance of the proposed technique with several voting strategies. We further analyse the effects of image size, perturbation magnitude, and the ratio of perturbed images in each image set. We also evaluate our technique against recent state-of-the-art defence methods and on the single-shot recognition task. The empirical results demonstrate superior performance on the CIFAR-10, MNIST, ETH-80, and Tiny ImageNet datasets.

【9】 CGEMs: A Metric Model for Automatic Code Generation using GPT-3
Link: https://arxiv.org/abs/2108.10168

Authors: Aishwarya Narasimhan, Krishna Prasad Agara Venkatesha Rao, Veena M B
Note: 11 pages, 6 figures, 2 tables
Abstract: Today, AI technology is showing its strengths in almost every industry and walk of life. From text generation and text summarization to chatbots, NLP is being used widely. One such paradigm is automatic code generation. An AI could be generating anything; hence the output space is unconstrained. A self-driving car is driven for 100 million miles to validate its safety, but tests cannot be written to monitor and cover an unconstrained space. One of the solutions to validate AI-generated content is to constrain the problem and convert it from abstract to realistic, and this can be accomplished by either validating the unconstrained algorithm using theoretical proofs or by using Monte-Carlo simulation methods. In this case, we use the latter approach to test/validate a statistically significant number of samples. This hypothesis of validating the AI-generated code is the main motive of this work, and to know if AI-generated code is reliable, a metric model, CGEMs, is proposed. This is an extremely challenging task as programs can have different logic with different naming conventions, but the metrics must capture the structure and logic of the program. This is similar to the importance grammar carries in AI-based text generation, Q&A, translation, etc. The various metrics that are garnered in this work to support the evaluation of generated code are as follows: compilation, NL description to logic conversion, number of edits needed, some of the commonly used static-code metrics, and NLP metrics. These metrics are applied to 80 codes generated using OpenAI's GPT-3. Following this, a neural network is designed for binary classification (acceptable/not acceptable quality of the generated code). The inputs to this network are the values of the features obtained from the metrics. The model achieves a classification accuracy of 76.92% and an F1 score of 55.56%. XAI is augmented for model interpretability.

【10】 Spatio-Temporal Split Learning for Privacy-Preserving Medical Platforms: Case Studies with COVID-19 CT, X-Ray, and Cholesterol Data
Link: https://arxiv.org/abs/2108.10147

Authors: Yoo Jeong Ha, Minjae Yoo, Gusang Lee, Soyi Jung, Sae Won Choi, Joongheon Kim, Seehwan Yoo
Funding: This work was supported by the Ministry of Health and Welfare (MHW), Korea.
Abstract: Machine learning requires a large volume of sample data, especially when it is used in high-accuracy medical applications. However, patient records are among the most sensitive private information and are not usually shared among institutes. This paper presents spatio-temporal split learning, a distributed deep neural network framework, which is a turning point in allowing collaboration among privacy-sensitive organizations. Our spatio-temporal split learning presents how distributed machine learning can be efficiently conducted with minimal privacy concerns. The proposed split learning consists of a number of clients and a centralized server. Each client has only one hidden layer, which acts as the privacy-preserving layer, and the centralized server comprises the other hidden layers and the output layer. Since the centralized server does not need to access the training data and trains the deep neural network with parameters received from the privacy-preserving layer, privacy of original data is guaranteed. We have coined the term spatio-temporal split learning, as multiple clients are spatially distributed to cover diverse datasets from different participants, and we can temporally split the learning process, detaching the privacy-preserving layer from the rest of the learning process to minimize privacy breaches. This paper shows how we can analyze medical data whilst ensuring privacy using our proposed multi-site spatio-temporal split learning algorithm on Coronavirus Disease-19 (COVID-19) chest Computed Tomography (CT) scans, MUsculoskeletal RAdiographs (MURA) X-ray images, and cholesterol levels.
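
The client/server cut described above is easy to picture in code. The following PyTorch sketch, with illustrative layer sizes, keeps a single hidden layer on the client as the privacy-preserving layer and sends only its activations to the server.

```python
# Split-learning sketch: one on-device hidden layer, the rest on the server.
# Only activations ("smashed data"), never raw data, leave the client.
import torch
import torch.nn as nn

client_layer = nn.Sequential(nn.Linear(784, 128), nn.ReLU())   # stays on-device
server_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

opt = torch.optim.SGD(
    list(client_layer.parameters()) + list(server_net.parameters()), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))  # toy private batch
smashed = client_layer(x)       # activations sent to the server
logits = server_net(smashed)    # server-side forward pass
loss = loss_fn(logits, y)
loss.backward()                 # gradients flow back through the cut layer
opt.step()
```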

【11】 Improving Accuracy of Permutation DAG Search using Best Order Score Search
Link: https://arxiv.org/abs/2108.10141

Authors: Joseph D. Ramsey
Affiliations: Carnegie Mellon University
Note: 25 pages, 12 tables
Abstract: The Sparsest Permutation (SP) algorithm is accurate but limited to about 9 variables in practice; the Greedy Sparsest Permutation (GSP) algorithm is faster but theoretically weaker. A compromise can be given, the Best Order Score Search (BOSS), which gives results as accurate as SP but for much larger and denser graphs. BOSS is more accurate for two reasons: (a) it assumes the "brute faithfulness" assumption, which is weaker than faithfulness, and (b) it uses a different traversal of permutations than the depth-first traversal used by GSP, obtained by taking each variable in turn and moving it to the position in the permutation that optimizes the model score. Results are given comparing BOSS to several related papers in the literature in terms of performance, for linear, Gaussian data. In all cases, with the proper parameter settings, accuracy of BOSS is lifted considerably with respect to competing approaches. In configurations tested, models with 60 variables are feasible with large samples out to about an average degree of 12 in reasonable time, with near-perfect accuracy, and sparse models with an average degree of 4 are feasible out to about 300 variables on a laptop, again with near-perfect accuracy. Mixed continuous-discrete and all-discrete datasets were also tested. The mixed data analysis showed an advantage for BOSS over GES more apparent at higher depths with the same score; the discrete data analysis showed a very small advantage for BOSS over GES with the same score, perhaps not enough to prefer it.
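
The BOSS permutation traversal (take each variable in turn and move it to the score-optimal position) can be sketched generically. The score function below is a toy stand-in, not a real BIC-style DAG score.

```python
# One BOSS-style pass over a permutation: relocate each variable to the
# position that optimizes the model score (lower is better here).
def boss_pass(perm, score):
    perm = list(perm)
    for var in list(perm):
        best_perm, best_score = perm, score(perm)
        base = [v for v in perm if v != var]
        for pos in range(len(perm)):
            cand = base[:pos] + [var] + base[pos:]
            s = score(cand)
            if s < best_score:
                best_perm, best_score = cand, s
        perm = best_perm
    return perm

# Toy score: total displacement from alphabetical order (illustrative only).
score = lambda p: sum(abs(i - sorted(p).index(v)) for i, v in enumerate(p))
print(boss_pass(list("dacb"), score))   # converges toward ['a', 'b', 'c', 'd']
```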

【12】 TRAPDOOR: Repurposing backdoors to detect dataset bias in machine learning-based genomic analysis
Link: https://arxiv.org/abs/2108.10132

Authors: Esha Sarkar, Michail Maniatakos
Affiliations: NYU Tandon School of Engineering, Brooklyn, New York, USA; Center for Cybersecurity, New York University Abu Dhabi, Abu Dhabi, UAE
Abstract: Machine Learning (ML) has achieved unprecedented performance in several applications including image, speech, text, and data analysis. Use of ML to understand underlying patterns in gene mutations (genomics) has far-reaching results, not only in overcoming diagnostic pitfalls, but also in designing treatments for life-threatening diseases like cancer. Success and sustainability of ML algorithms depend on the quality and diversity of data collected and used for training. Under-representation of groups (ethnic groups, gender groups, etc.) in such a dataset can lead to inaccurate predictions for certain groups, which can further exacerbate systemic discrimination issues. In this work, we propose TRAPDOOR, a methodology for identification of biased datasets by repurposing a technique that has been mostly proposed for nefarious purposes: neural network backdoors. We consider a typical collaborative learning setting of the genomics supply chain, where data may come from hospitals, collaborative projects, or research institutes to a central cloud without awareness of bias against a sensitive group. In this context, we develop a methodology to leak potential bias information of the collective data without hampering the genuine performance, using ML backdooring catered for genomic applications. Using a real-world cancer dataset, we analyze the dataset with the bias that already existed towards white individuals and also introduce biases in datasets artificially, and our experimental results show that TRAPDOOR can detect the presence of dataset bias with 100% accuracy, and furthermore can also extract the extent of bias by recovering the percentage with a small error.

【13】 A study on Machine Learning Approaches for Player Performance and Match Results Prediction
Link: https://arxiv.org/abs/2108.10125

Authors: Harsh Mittal, Deepak Rikhari, Jitendra Kumar, Ashutosh Kumar Singh
Affiliations: Department of Computer Applications, National Institute of Technology, Kurukshetra, Haryana; National Institute of Technology, Tiruchirappalli, Tamil Nadu
Abstract: Cricket is unarguably one of the most popular sports in the world. Predicting the outcome of a cricket match has become a fundamental problem as we advance in the field of machine learning. Multiple researchers have tried to predict the outcome of a cricket match or a tournament, to predict the performance of players during a match, or to predict the players who should be selected based on their current performance, form, morale, etc., using machine learning and artificial intelligence techniques, keeping in mind extensive detailing, features, and parameters. We discuss some of these techniques along with a brief comparison among them.

【14】 Integrating Transductive And Inductive Embeddings Improves Link Prediction Accuracy
Link: https://arxiv.org/abs/2108.10108

Authors: Chitrank Gupta, Yash Jain, Abir De, Soumen Chakrabarti
Affiliations: IIT Bombay, India
Note: 5 pages, accepted by CIKM 2021
Abstract: In recent years, inductive graph embedding models, viz. graph neural networks (GNNs), have become increasingly accurate at link prediction (LP) in online social networks. The performance of such networks depends strongly on the input node features, which vary across networks and applications. Selecting appropriate node features remains application-dependent and generally an open question. Moreover, owing to privacy and ethical issues, use of personalized node features is often restricted. In fact, many publicly available data from online social networks do not contain any node features (e.g., demography). In this work, we provide a comprehensive experimental analysis which shows that harnessing a transductive technique (e.g., Node2Vec) for obtaining initial node representations, after which an inductive node embedding technique takes over, leads to substantial improvements in link prediction accuracy. We demonstrate that, for a wide variety of GNN variants, node representation vectors obtained from Node2Vec serve as high quality input features to GNNs, thereby improving LP performance.
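
The recipe the authors evaluate, transductive embeddings first and an inductive GNN second, can be sketched as follows. The walks here are plain unbiased random walks rather than Node2Vec's biased walks, and gensim's Word2Vec learns the embeddings; both simplifications are assumptions for brevity, and the GNN stage is indicated only by its input features.

```python
# Node2Vec-style pipeline sketch: random walks -> skip-gram embeddings -> the
# resulting vectors become the input feature matrix X of an inductive GNN.
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()

def random_walks(G, num_walks=10, walk_len=20):
    walks = []
    for _ in range(num_walks):
        for node in G.nodes():
            walk = [node]
            while len(walk) < walk_len:
                walk.append(random.choice(list(G.neighbors(walk[-1]))))
            walks.append([str(n) for n in walk])   # tokens for Word2Vec
    return walks

w2v = Word2Vec(random_walks(G), vector_size=32, window=5, min_count=1, sg=1)
X = [w2v.wv[str(n)] for n in G.nodes()]   # node feature matrix for the GNN
print(len(X), len(X[0]))                  # 34 nodes x 32-dim features
```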

【15】 Distilling Neuron Spike with High Temperature in Reinforcement Learning Agents
Link: https://arxiv.org/abs/2108.10078

Authors: Ling Zhang, Jian Cao, Yuan Zhang, Bohan Zhou, Shuo Feng
Affiliations: School of Software and Microelectronics, Peking University, Beijing, China
Note: 7 pages, 5 figures, conference
Abstract: Spiking neural networks (SNNs), compared with deep neural networks (DNNs), have faster processing speed, lower energy consumption and more biological interpretability, and are expected to approach strong AI. Reinforcement learning is similar to learning in biology, so it is of great significance to study the combination of SNNs and RL. We propose the reinforcement learning method of spike distillation network (SDN) with STBP. This method uses distillation to effectively avoid the weakness of STBP, can achieve SOTA performance in classification, and yields a smaller, faster-converging and lower-power-consumption SNN reinforcement learning model. Experiments show that our method converges faster than traditional SNN reinforcement learning and DNN reinforcement learning methods, about 1000 epochs faster, and obtains an SNN 200 times smaller than the DNN. We also deploy SDN to the PKU nc64c chip, which proves that SDN has lower power consumption than DNN, and the power consumption of SDN is more than 600 times lower than DNN on large-scale devices. SDN provides a new way of SNN reinforcement learning and can achieve SOTA performance, which proves the possibility of further development of SNN reinforcement learning.

【16】 Dynamic Neural Network Architectural and Topological Adaptation and Related Methods -- A Survey
Link: https://arxiv.org/abs/2108.10066

Authors: Lorenz Kummer
Affiliations: Computer Science Department, University of Vienna
Note: 12 pages, preprint
Abstract: Training and inference in deep neural networks (DNNs) has, due to a steady increase in architectural complexity and dataset size, led to the development of strategies for reducing the time and space requirements of DNN training and inference, which is of particular importance in scenarios where training takes place in resource-constrained computation environments or inference is part of a time-critical application. In this survey, we aim to provide a general overview and categorization of the state of the art (SOTA) in techniques for reducing DNN training and inference time and space complexities, with a particular focus on architectural adaptions.

【17】 EEG-based Classification of Drivers Attention using Convolutional Neural Network
Link: https://arxiv.org/abs/2108.10062

Authors: Fred Atilla, Maryam Alimardani
Affiliations: Department of Cognitive Science and Artificial Intelligence, Tilburg University, Tilburg, Netherlands
Note: Accepted paper at ICHMS2021
Abstract: Accurate detection of a driver's attention state can help develop assistive technologies that respond to unexpected hazards in real time and therefore improve road safety. This study compares the performance of several attention classifiers trained on participants' brain activity. Participants performed a driving task in an immersive simulator where the car randomly deviated from the cruising lane. They had to correct the deviation, and their response time was considered as an indicator of attention level. Participants repeated the task in two sessions; in one session they received kinesthetic feedback and in another session no feedback. Using their EEG signals, we trained three attention classifiers: a support vector machine (SVM) using EEG spectral band powers, and a Convolutional Neural Network (CNN) using either spectral features or the raw EEG data. Our results indicated that the CNN model trained on raw EEG data obtained under kinesthetic feedback achieved the highest accuracy (89%). While using a participant's own brain activity to train the model resulted in the best performances, inter-subject transfer learning still performed high (75%), showing promise for calibration-free Brain-Computer Interface (BCI) systems. Our findings show that CNNs and raw EEG signals can be employed for effective training of a passive BCI for real-time attention classification.
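
A toy version of a CNN operating on raw multi-channel EEG windows, as in the study's best-performing model, might look as follows; the layer sizes and two-class output are illustrative assumptions, not the authors' architecture.

```python
# Sketch of a 1D CNN over raw EEG: input is (batch, channels, time samples).
import torch
import torch.nn as nn

class RawEEGCNN(nn.Module):
    def __init__(self, n_channels=32, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),           # pool over the time axis
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                      # x: (batch, channels, time)
        return self.head(self.features(x).squeeze(-1))

model = RawEEGCNN()
print(model(torch.randn(8, 32, 256)).shape)   # (8, 2) attention-state logits
```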

【18】 Remote Sensing and Machine Learning for Food Crop Production Data in Africa Post-COVID-19
Link: https://arxiv.org/abs/2108.10054

Authors: Racine Ly, Khadim Dia, Mariam Diallo
Affiliations: AKADEMIYA, Kigali, Rwanda
Note: This chapter has been submitted to the Annual Trends and Outlook Report (ATOR, 2021), a flagship report of the Regional Strategic Analysis and Knowledge Support System (ReSAKSS) program at AKADEMIYA2063. The chapter has 22 pages, 14 images, 9 tables, and 36 references.
Abstract: In the agricultural sector, COVID-19 threatens to lead to a severe food security crisis in the region, with disruptions in the food supply chain and agricultural production expected to contract between 2.6% and 7%. On the food crop production side, the travel bans and border closures and the late reception and use of agricultural inputs such as imported seeds, fertilizers, and pesticides could lead to poor food crop production performances. Another layer of disruption introduced by the mobility restriction measures is the scarcity of agricultural workers, mainly seasonal workers. The lockdown measures and border closures limit seasonal workers' availability to get to the farm on time for planting and harvesting activities. Moreover, most of the imported agricultural inputs travel by air, which the pandemic has heavily impacted. Such transportation disruptions can also negatively affect the food crop production system. This chapter assesses food crop production levels in 2020 -- before the harvesting period -- in all African regions and for four staples: maize, cassava, rice, and wheat. The production levels are predicted using the combination of biogeophysical remote sensing data retrieved from satellite images and the machine learning artificial neural networks (ANNs) technique. The remote sensing products are used as input variables and the ANNs as the predictive modeling framework. The input remote sensing products are the Normalized Difference Vegetation Index (NDVI), the daytime Land Surface Temperature (LST), rainfall data, and agricultural lands' Evapotranspiration (ET). The output maps and data are made publicly available on a web-based platform, AAgWa (Africa Agriculture Watch, www.aagwa.org), to facilitate access to such information for policymakers, decision-makers, and other stakeholders.

【19】 Integrating LSTMs and GNNs for COVID-19 Forecasting
Link: https://arxiv.org/abs/2108.10052

Authors: Nathan Sesti, Juan Jose Garau-Luis, Edward Crawley, Bruce Cameron
Affiliations: Massachusetts Institute of Technology
Abstract: The spread of COVID-19 has coincided with the rise of Graph Neural Networks (GNNs), leading to several studies proposing their use to better forecast the evolution of the pandemic. Many such models also include Long Short Term Memory (LSTM) networks, a common tool for time series forecasting. In this work, we further investigate the integration of these two methods by implementing GNNs within the gates of an LSTM and exploiting spatial information. In addition, we introduce a skip connection which proves critical to jointly capture the spatial and temporal patterns in the data. We validate our daily COVID-19 new cases forecast model on data of 37 European nations for the last 472 days and show superior performance compared to state-of-the-art graph time series models based on mean absolute scaled error (MASE). This area of research has important applications to policy-making and we analyze its potential for pandemic resource control.
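
The core idea, GNNs inside the LSTM gates plus a skip connection, can be sketched as a single recurrent cell. Everything below (the normalized adjacency A_hat, dimensions, the exact form of the skip path) is an illustrative assumption rather than the paper's exact model.

```python
# Graph-LSTM cell sketch: neighbour mixing (A_hat @ .) enters every LSTM gate,
# and a skip connection carries the input directly to the output state.
import torch
import torch.nn as nn

class GraphLSTMCell(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.Wx = nn.Linear(in_dim, 4 * hid_dim)   # input-to-gates transform
        self.Wh = nn.Linear(hid_dim, 4 * hid_dim)  # state-to-gates transform
        self.skip = nn.Linear(in_dim, hid_dim)     # input-to-output skip path

    def forward(self, x, h, c, A_hat):
        gates = self.Wx(A_hat @ x) + self.Wh(A_hat @ h)   # spatially mixed gates
        i, f, g, o = gates.chunk(4, dim=-1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c) + self.skip(x)  # skip connection
        return h, c

n_nodes, in_dim, hid = 37, 4, 16                  # e.g., 37 nations as nodes
cell = GraphLSTMCell(in_dim, hid)
A_hat = torch.softmax(torch.randn(n_nodes, n_nodes), dim=-1)  # toy adjacency
x = torch.randn(n_nodes, in_dim)
h = c = torch.zeros(n_nodes, hid)
h, c = cell(x, h, c, A_hat)
print(h.shape)                                    # torch.Size([37, 16])
```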

【20】 Discovering Spatial Relationships by Transformers for Domain Generalization
Link: https://arxiv.org/abs/2108.10046

Authors: Cuicui Kang, Karthik Nandakumar
Abstract: Due to the rapid increase in the diversity of image data, the problem of domain generalization has received increased attention recently. While domain generalization is a challenging problem, it has achieved great development thanks to the fast development of AI techniques in computer vision. Most of these advanced algorithms are proposed with deep architectures based on convolutional neural nets (CNNs). However, though CNNs have a strong ability to find discriminative features, they do a poor job of modeling the relations between different locations in the image, because the responses of CNN filters are mostly local. Since these local and global spatial relationships characterize an object under consideration, they play a critical role in improving the generalization ability against the domain gap. In order to capture object-part relationships for better domain generalization, this work proposes to use the self-attention model. However, attention models were proposed for sequences and are not adept at discriminative feature extraction for 2D images. Considering this, we propose a hybrid architecture to discover the spatial relationships between these local features, and derive a composite representation that encodes both the discriminative features and their relationships to improve domain generalization. Evaluation on three well-known benchmarks demonstrates the benefits of modeling relationships between the features of an image using the proposed method, which achieves state-of-the-art domain generalization performance. More specifically, the proposed algorithm outperforms the state-of-the-art by $2.2\%$ and $3.4\%$ on the PACS and Office-Home databases, respectively.

【21】 Deep Relational Metric Learning
Link: https://arxiv.org/abs/2108.10026

Authors: Wenzhao Zheng, Borui Zhang, Jiwen Lu, Jie Zhou
Affiliations: Department of Automation, Tsinghua University, China; Beijing National Research Center for Information Science and Technology, China
Note: Accepted to ICCV 2021. Source code available at this https URL
Abstract: This paper presents a deep relational metric learning (DRML) framework for image clustering and retrieval. Most existing deep metric learning methods learn an embedding space with a general objective of increasing interclass distances and decreasing intraclass distances. However, the conventional losses of metric learning usually suppress intraclass variations which might be helpful to identify samples of unseen classes. To address this problem, we propose to adaptively learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions. We further employ a relational module to capture the correlations among each feature in the ensemble and construct a graph to represent an image. We then perform relational inference on the graph to integrate the ensemble and obtain a relation-aware embedding to measure the similarities. Extensive experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.

【22】 QDEF and Its Approximations in OBDM
Link: https://arxiv.org/abs/2108.10021

Authors: Gianluca Cima, Federico Croce, Maurizio Lenzerini
Affiliations: CNRS & University of Bordeaux, Bordeaux, France; Sapienza University of Rome, Rome, Italy
Note: A more compact version of this paper will be published in the proceedings of the 30th ACM International Conference on Information and Knowledge Management. The associated DOI is: this https URL
Abstract: Given an input dataset (i.e., a set of tuples), query definability in Ontology-based Data Management (OBDM) amounts to finding a query over the ontology whose certain answers coincide with the tuples in the given dataset. We refer to such a query as a characterization of the dataset with respect to the OBDM system. Our first contribution is to propose approximations of perfect characterizations in terms of recall (complete characterizations) and precision (sound characterizations). A second contribution is to present a thorough complexity analysis of three computational problems, namely verification (check whether a given query is a perfect, or an approximated characterization of a given dataset), existence (check whether a perfect, or a best approximated characterization of a given dataset exists), and computation (compute a perfect, or best approximated characterization of a given dataset).

【23】 BiaSwap: Removing dataset bias with bias-tailored swapping augmentation
Link: https://arxiv.org/abs/2108.10008

Authors: Eungyeup Kim, Jihyeon Lee, Jaegul Choo
Affiliations: KAIST
Note: Accepted to ICCV'21
Abstract: Deep neural networks often make decisions based on the spurious correlations inherent in the dataset, failing to generalize in an unbiased data distribution. Although previous approaches pre-define the type of dataset bias to prevent the network from learning it, recognizing the bias type in the real dataset is often prohibitive. This paper proposes a novel bias-tailored augmentation-based approach, BiaSwap, for learning debiased representation without requiring supervision on the bias type. Assuming that the bias corresponds to the easy-to-learn attributes, we sort the training images based on how much a biased classifier can exploit them as a shortcut, and divide them into bias-guiding and bias-contrary samples in an unsupervised manner. Afterwards, we integrate the style-transferring module of an image translation model with the class activation maps of such a biased classifier, which enables us to primarily transfer the bias attributes learned by the classifier. Therefore, given a pair of bias-guiding and bias-contrary samples, BiaSwap generates a bias-swapped image which contains the bias attributes from the bias-contrary image, while preserving bias-irrelevant ones in the bias-guiding image. Given such augmented images, BiaSwap demonstrates superiority in debiasing against the existing baselines over both synthetic and real-world datasets. Even without careful supervision on the bias, BiaSwap achieves a remarkable performance on both unbiased and bias-guiding samples, implying the improved generalization capability of the model.

【24】 Credit Card Fraud Detection using Machine Learning: A Study
Link: https://arxiv.org/abs/2108.10005

Authors: Pooja Tiwari, Simran Mehta, Nishtha Sakhuja, Jitendra Kumar, Ashutosh Kumar Singh
Affiliations: Department of Computer Applications, National Institute of Technology, Kurukshetra, Haryana; National Institute of Technology, Tiruchirappalli, Tamil Nadu
Abstract: As the world is rapidly moving towards digitization and money transactions are becoming cashless, the use of credit cards has rapidly increased. The fraud activities associated with them have also been increasing, which leads to a huge loss for financial institutions. Therefore, we need to analyze and detect fraudulent transactions and separate them from non-fraudulent ones. In this paper, we present a comprehensive review of various methods used to detect credit card fraud. These methodologies include Hidden Markov Models, Decision Trees, Logistic Regression, Support Vector Machines (SVM), Genetic Algorithms, Neural Networks, Random Forests, and Bayesian Belief Networks. A comprehensive analysis of the various techniques is presented. We conclude the paper with the pros and cons of each, as stated in the respective papers.

【25】 MS-DARTS: Mean-Shift Based Differentiable Architecture Search
Link: https://arxiv.org/abs/2108.09996

Authors: Jun-Wei Hsieh, Ming-Ching Chang, Ping-Yang Chen, Cheng-Han Chou, Chih-Sheng Huang
Affiliations: College of Artificial Intelligence and Green Energy, National Yang Ming Chiao Tung University; Department of Computer Science, University at Albany - SUNY; Elan Corporation
Note: 14 pages
Abstract: Differentiable Architecture Search (DARTS) is an effective continuous relaxation-based network architecture search (NAS) method with low search cost. It has attracted significant attention in Auto-ML research and has become one of the most useful paradigms in NAS. Although DARTS can produce superior efficiency over traditional NAS approaches with better control of complex parameters, it often suffers from stabilization issues, producing deteriorating architectures when discretizing the continuous architecture. We observed considerable loss of validity causing a dramatic decline in performance at this final discretization step of DARTS. To address this issue, we propose a Mean-Shift based DARTS (MS-DARTS) to improve stability based on sampling and perturbation. Our approach can improve both the stability and accuracy of DARTS by smoothing the loss landscape and sampling architecture parameters within a suitable bandwidth. We investigate the convergence of our mean-shift approach, together with the effects of bandwidth selection that affect stability and accuracy. Evaluations performed on CIFAR-10, CIFAR-100, and ImageNet show that MS-DARTS achieves higher performance over other state-of-the-art NAS methods with reduced search cost.
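
The mean-shift ingredient, sampling architecture parameters within a bandwidth and shifting them toward a loss-weighted mean, can be shown on a toy objective. The quadratic loss below is a stand-in for a real supernet validation loss, and the weighting scheme is an illustrative assumption.

```python
# Mean-shift smoothing sketch over continuous architecture parameters alpha:
# sample perturbations within a bandwidth, then shift alpha toward the mean of
# the samples weighted by (negated, exponentiated) validation loss.
import numpy as np

rng = np.random.default_rng(0)
loss = lambda a: ((a - 0.3) ** 2).sum()      # toy stand-in validation loss

alpha = rng.normal(size=8)                   # continuous architecture weights
bandwidth, n_samples = 0.5, 64
for step in range(50):
    samples = alpha + rng.uniform(-bandwidth, bandwidth,
                                  size=(n_samples, alpha.size))
    weights = np.exp(-np.array([loss(s) for s in samples]))  # favour low loss
    alpha = (weights[:, None] * samples).sum(0) / weights.sum()  # mean shift
print(np.round(alpha, 2))                    # drifts toward the flat optimum 0.3
```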

【26】 Farsighted Probabilistic Sampling based Local Search for (Weighted) Partial MaxSAT
Link: https://arxiv.org/abs/2108.09988

Authors: Jiongzhi Zheng, Jianrong Zhou, Kun He
Affiliations: School of Computer Science, Huazhong University of Science and Technology, China; Institute of Artificial Intelligence, Huazhong University of Science and Technology, China
Note: Submitted to AAAI 2022
Abstract: Partial MaxSAT (PMS) and Weighted Partial MaxSAT (WPMS) are both practical generalizations of the typical combinatorial problem of MaxSAT. In this work, we propose an effective farsighted probabilistic sampling based local search algorithm called FPS for solving these two problems, denoted as (W)PMS. The FPS algorithm replaces the mechanism of flipping a single variable per iteration step, which is widely used in existing (W)PMS local search algorithms, with the proposed farsighted local search strategy, and provides higher-quality local optimal solutions. The farsighted strategy employs the probabilistic sampling technique that allows the algorithm to look ahead widely and efficiently. In this way, FPS can provide more and better search directions and improve the performance without reducing the efficiency. Extensive experiments on all the benchmarks of (W)PMS problems from the incomplete track of the recent four years of MaxSAT Evaluations demonstrate that our method significantly outperforms SATLike3.0, the state-of-the-art local search algorithm, for solving both the PMS and WPMS problems. We furthermore compare with the extended solver of SATLike, SATLike-c, which is the champion of three categories among the total four (PMS and WPMS categories, each associated with two time limits) of the incomplete track in the recent MaxSAT Evaluation (MSE2021). We replace the local search component in SATLike-c with the proposed farsighted sampling local search approach, and the resulting solver FPS-c also outperforms SATLike-c for solving both the PMS and WPMS problems.

【27】 TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Link: https://arxiv.org/abs/2108.09980

Authors: Jianwei Yang, Yonatan Bisk, Jianfeng Gao
Affiliations: Microsoft Research; Carnegie Mellon University
Note: Accepted by ICCV 2021
Abstract: Contrastive learning has been widely used to train transformer-based vision-language models for video-text alignment and multi-modal representation learning. This paper presents a new algorithm called Token-Aware Cascade contrastive learning (TACo) that improves contrastive learning using two novel techniques. The first is the token-aware contrastive loss which is computed by taking into account the syntactic classes of words. This is motivated by the observation that for a video-text pair, the content words in the text, such as nouns and verbs, are more likely to be aligned with the visual contents in the video than the function words. Second, a cascade sampling method is applied to generate a small set of hard negative examples for efficient loss estimation for multi-modal fusion layers. To validate the effectiveness of TACo, in our experiments we finetune pretrained models for a set of downstream tasks including text-video retrieval (YouCook2, MSR-VTT and ActivityNet), video action step localization (CrossTask), and video action segmentation (COIN). The results show that our models attain consistent improvements across different experimental settings over previous methods, setting new state-of-the-art on the three public text-video retrieval benchmarks of YouCook2, MSR-VTT and ActivityNet.
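
A toy rendering of a token-aware contrastive loss: per-token video-text similarities are pooled with weights that up-weight content words (nouns and verbs), following the paper's motivation. The weighting scheme, pooling, and plain batch InfoNCE below are illustrative assumptions, not TACo's exact loss.

```python
# Token-aware InfoNCE sketch: weight per-token similarities by syntactic class.
import torch
import torch.nn.functional as F

def token_aware_nce(video, text_tokens, pos_weights, tau=0.07):
    """video: (B, D) clip embeddings; text_tokens: (B, L, D) token embeddings;
    pos_weights: (B, L) weights favouring content words (nouns/verbs)."""
    v = F.normalize(video, dim=-1)
    t = F.normalize(text_tokens, dim=-1)
    tok_sim = torch.einsum("bd,cld->bcl", v, t)          # all video-text pairs
    w = pos_weights / pos_weights.sum(-1, keepdim=True)  # normalize per text
    sim = (tok_sim * w.unsqueeze(0)).sum(-1) / tau       # (B, B) similarities
    labels = torch.arange(video.size(0))                 # matched pairs on diagonal
    return F.cross_entropy(sim, labels)

B, L, D = 4, 12, 64
video, text = torch.randn(B, D), torch.randn(B, L, D)
pos_w = torch.where(torch.rand(B, L) > 0.5, 2.0, 1.0)    # content words get 2x
print(token_aware_nce(video, text, pos_w))
```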

【28】 Pulse-Width Modulation Neuron Implemented by Single Positive-Feedback Device
Link: https://arxiv.org/abs/2108.09954

Authors: Sung Yun Woo, Dongseok Kwon, Byung-Gook Park, Jong-Ho Lee, Jong-Ho Bae
Affiliations: Department of Electrical and Computer Engineering and Inter-University Semiconductor Research Center
Abstract: A positive-feedback (PF) device and its operation scheme to implement the pulse-width modulation (PWM) function are proposed and demonstrated, and the device operation mechanism for implementing the PWM function is analyzed. By adjusting the amount of the charge stored in the n- floating body (Qn), the potential of the floating body changes linearly with time. When Qn reaches a threshold value (Qth), the PF device turns on abruptly. From the linear time-varying property of Qn and the gate-bias dependency of Qth, fully functional PWM neuron properties, including voltage-to-pulse-width conversion and a hard-sigmoid activation function, were successfully obtained from a single PF device. A PWM neuron can thus be implemented using a single PF device, which drastically reduces the area of a PWM neuron circuit compared to previously reported ones.

【29】 Voxel-based Network for Shape Completion by Leveraging Edge Generation
Link: https://arxiv.org/abs/2108.09936

Authors: Xiaogang Wang, Marcelo H Ang Jr, Gim Hee Lee
Affiliations: National University of Singapore
Note: ICCV 2021
Abstract: Deep learning techniques have yielded significant improvements in point cloud completion, with the aim of completing missing object shapes from partial inputs. However, most existing methods fail to recover realistic structures due to over-smoothing of fine-grained details. In this paper, we develop a voxel-based network for point cloud completion by leveraging edge generation (VE-PCN). We first embed point clouds into regular voxel grids, and then generate complete objects with the help of the hallucinated shape edges. This decoupled architecture together with a multi-scale grid feature learning is able to generate more realistic on-surface details. We evaluate our model on the publicly available completion datasets and show that it outperforms existing state-of-the-art approaches quantitatively and qualitatively. Our source code is available at https://github.com/xiaogangw/VE-PCN.
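
The first step named in the abstract, embedding a point cloud into a regular voxel grid, is sketched below as a simple occupancy voxelization; the resolution and normalization are illustrative choices.

```python
# Occupancy voxelization sketch: (N, 3) points -> (R, R, R) binary grid.
import numpy as np

def voxelize(points, resolution=32):
    mins, maxs = points.min(0), points.max(0)
    idx = (points - mins) / np.maximum(maxs - mins, 1e-9)   # normalize to [0, 1]
    idx = np.clip((idx * resolution).astype(int), 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0             # mark occupied cells
    return grid

partial = np.random.rand(2048, 3)            # stand-in for a partial scan
print(voxelize(partial).sum())               # number of occupied voxels
```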

【30】 Federated Learning Meets Fairness and Differential Privacy
Link: https://arxiv.org/abs/2108.09932

Authors: Manisha Padala, Sankarshan Damle, Sujit Gujar
Affiliations: Machine Learning Lab, International Institute of Information Technology (IIIT), Hyderabad
Abstract: Deep learning's unprecedented success raises several ethical concerns ranging from biased predictions to data privacy. Researchers tackle these issues by introducing fairness metrics, or federated learning, or differential privacy. In a first, this work presents an ethical federated learning model incorporating all three measures simultaneously. Experiments on the Adult, Bank and Dutch datasets highlight the resulting "empirical interplay" between accuracy, fairness, and privacy.
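
One ingredient of such a model, differentially private federated averaging, can be sketched as clip-then-noise aggregation of client updates. The clip norm and noise scale below are illustrative assumptions, and the sketch omits the fairness term the paper also incorporates.

```python
# DP federated averaging sketch: clip each client's update to an L2 ball,
# average, then add Gaussian noise at the server.
import numpy as np

def dp_fedavg(updates, clip=1.0, noise_mult=0.5, seed=0):
    rng = np.random.default_rng(seed)
    clipped = []
    for u in updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip / max(norm, 1e-12)))  # clip to L2 ball
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip / len(updates), size=mean.shape)
    return mean + noise

updates = [np.random.randn(10) for _ in range(8)]   # per-client model deltas
print(dp_fedavg(updates))
```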

【31】 Fluent: An AI Augmented Writing Tool for People who Stutter 标题:流利:一款针对口吃者的人工智能增强写作工具 链接:https://arxiv.org/abs/2108.09918

作者:Bhavya Ghai,Klaus Mueller 机构:Stony Brook University 备注:Accepted to ACM ASSETS 2021 conference 摘要:口吃是一种言语障碍,影响着全世界数百万人的个人和职业生活。为了使自己免于耻辱和歧视,口吃者(PWS)可能会采取不同的策略来掩盖口吃。其中一个常见的策略是单词替换,即个体避免说他们可能结巴的单词,而是使用另一个替代词。这个过程本身会造成压力,增加负担。在这项工作中,我们介绍了Fluent,一种人工智能增强的写作工具,它可以帮助PWS编写脚本,使他们能够说得更流利。Fluent体现了一种新的基于主动学习的方法,用于识别个人可能难以发音的单词。这些词在界面中突出显示。将鼠标悬停在任何此类单词上,Fluent会显示一组具有类似含义但更容易说话的备选单词。用户可以自由接受或忽略这些建议。基于这种用户交互(反馈),Fluent不断改进其分类器,以更好地满足每个用户的个性化需求。我们通过测量10个模拟用户识别难懂单词的能力来评估我们的工具。我们发现,我们的工具可以在20次以下的互动中识别难词,平均准确率超过80%,并且随着反馈的增加,它会不断改进。我们的工具可以用于某些重要的生活场合,如演讲、演示等。该工具的源代码已在github.com/bhavyaghai/Fluent上公开提供。 摘要:Stuttering is a speech disorder which impacts the personal and professional lives of millions of people worldwide. To save themselves from stigma and discrimination, people who stutter (PWS) may adopt different strategies to conceal their stuttering. One of the common strategies is word substitution where an individual avoids saying a word they might stutter on and use an alternative instead. This process itself can cause stress and add more burden. In this work, we present Fluent, an AI augmented writing tool which assists PWS in writing scripts which they can speak more fluently. Fluent embodies a novel active learning based method of identifying words an individual might struggle pronouncing. Such words are highlighted in the interface. On hovering over any such word, Fluent presents a set of alternative words which have similar meaning but are easier to speak. The user is free to accept or ignore these suggestions. Based on such user interaction (feedback), Fluent continuously evolves its classifier to better suit the personalized needs of each user. We evaluated our tool by measuring its ability to identify difficult words for 10 simulated users. We found that our tool can identify difficult words with a mean accuracy of over 80% in under 20 interactions and it keeps improving with more feedback. Our tool can be beneficial for certain important life situations like giving a talk, presentation, etc. The source code for this tool has been made publicly accessible at github.com/bhavyaghai/Fluent.

【32】 Genetic Programming for Manifold Learning: Preserving Local Topology 标题:流形学习的遗传规划:保持局部拓扑 链接:https://arxiv.org/abs/2108.09914

作者:Andrew Lensen,Bing Xue,Mengjie Zhang 机构:This work was supported in part by the Marsden Fund of the New Zealand Government under Contracts VUW1913 and VUW1914 and the University Research Fund at Te Herenga Waka–Victoria University of Wellington under grant number 2261614164 备注:Accepted by IEEE Transactions on Evolutionary Computation, 2021 摘要:在数据集日益庞大的今天,流形学习方法是一种非常宝贵的工具。流形学习算法可以通过保留原始数据最重要结构的非线性变换发现高维数据集的低维表示(嵌入)。最先进的流形学习方法直接优化嵌入,而无需在原始空间和发现的嵌入空间之间进行映射。这使得可解释性——探索性数据分析的关键要求——几乎不可能实现。最近,遗传规划通过将函数映射从原始空间演化到嵌入空间,成为一种非常有前途的流形学习方法。然而,基于遗传编程的流形学习一直难以与其他方法的性能相匹配。在这项工作中,我们提出了一种使用遗传规划进行流形学习的新方法,以保持局部拓扑。这有望显著提高局部邻域结构(拓扑)至关重要的任务的性能。我们将我们提出的方法与各种基线流形学习方法进行比较,发现它通常优于其他方法,包括比以前的遗传编程方法有明显的改进。鉴于进化映射的潜在可解释性和可重用性,这些结果尤其有希望。 摘要:Manifold learning methods are an invaluable tool in today's world of increasingly huge datasets. Manifold learning algorithms can discover a much lower-dimensional representation (embedding) of a high-dimensional dataset through non-linear transformations that preserve the most important structure of the original data. State-of-the-art manifold learning methods directly optimise an embedding without mapping between the original space and the discovered embedded space. This makes interpretability - a key requirement in exploratory data analysis - nearly impossible. Recently, genetic programming has emerged as a very promising approach to manifold learning by evolving functional mappings from the original space to an embedding. However, genetic programming-based manifold learning has struggled to match the performance of other approaches. In this work, we propose a new approach to using genetic programming for manifold learning, which preserves local topology. This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount. We compare our proposed approach with various baseline manifold learning methods and find that it often outperforms other methods, including a clear improvement over previous genetic programming approaches. These results are particularly promising, given the potential interpretability and reusability of the evolved mappings.
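A plausible fitness term for topology-preserving manifold learning, measuring how much of each point's k-nearest-neighbour set survives the evolved mapping; this is a sketch of the general idea, and the paper's actual objective may differ.

```python
import numpy as np

def knn_sets(X, k):
    """Index sets of the k nearest neighbours of each point (Euclidean)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return [set(np.argsort(row)[:k]) for row in d]

def neighbourhood_preservation(X_orig, X_emb, k=10):
    """Fraction of each point's k-NN set preserved by the embedding."""
    orig, emb = knn_sets(X_orig, k), knn_sets(X_emb, k)
    return float(np.mean([len(a & b) / k for a, b in zip(orig, emb)]))

X = np.random.rand(200, 20)
X_low = X[:, :2]   # stand-in for the output of an evolved GP mapping
print(neighbourhood_preservation(X, X_low))
```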

【33】 Rate distortion comparison of a few gradient quantizers 标题:几种梯度量化器的率失真比较 链接:https://arxiv.org/abs/2108.09899

作者:Tharindu Adikari 机构:University of Toronto 摘要:本文是在梯度压缩的背景下进行的。梯度压缩是一种流行的技术,用于缓解在使用基于梯度的方法(如随机梯度下降)以分布式方式训练大型机器学习模型时观察到的通信瓶颈。在本文中,假设梯度分量为高斯分布,我们找到了缩放符号和Top-K等梯度量化方案的率失真权衡,并与香农率失真极限进行了比较。还与矢量量化器进行了类似的比较。 摘要:This article is in the context of gradient compression. Gradient compression is a popular technique for mitigating the communication bottleneck observed when training large machine learning models in a distributed manner using gradient-based methods such as stochastic gradient descent. In this article, assuming a Gaussian distribution for the components in gradient, we find the rate distortion trade-off of gradient quantization schemes such as Scaled-sign and Top-K, and compare with the Shannon rate distortion limit. A similar comparison with vector quantizers also is presented.
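For reference, the Shannon rate-distortion limit against which the abstract compares Scaled-sign and Top-K has a standard closed form for a Gaussian source under squared-error distortion (a textbook result, not specific to this paper):

```latex
% Rate-distortion function of a zero-mean Gaussian source with variance \sigma^2
% under squared-error distortion:
R(D) =
\begin{cases}
  \frac{1}{2}\log_2\frac{\sigma^2}{D}, & 0 < D \le \sigma^2, \\
  0, & D > \sigma^2.
\end{cases}
```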

【34】 DTWSSE: Data Augmentation with a Siamese Encoder for Time Series 标题:DTWSSE:使用暹罗编码器进行时间序列的数据增强 链接:https://arxiv.org/abs/2108.09885

作者:Xinyu Yang,Xinlan Zhang,Zhenguo Zhang,Yahui Zhao,Rongyi Cui 机构:Department of Computer Science and Technology, Yanbian University, Gongyuan Road, Yanji, People’s Republic of China 备注:Accepted as full research paper in APWEB-WAIM 2021 摘要:在现实世界中,对标记时间序列数据的访问往往受到限制,这限制了时间序列分析领域中深度学习模型的性能。数据扩充是解决时间序列数据样本量小、不平衡问题的有效途径。数据扩充的两个关键因素是距离度量和插值方法的选择。SMOTE在时间序列数据上表现不佳,因为它使用欧几里德距离度量并直接在对象上插值。因此,我们提出了一种基于DTW的合成少数过采样技术,使用暹罗编码器进行插值,名为DTWSSE。为了合理地测量时间序列的距离,采用DTW作为距离度量,该方法已被证实是一种适用于时间序列的有效度量方法。为了适应DTW度量,我们使用以无监督自训练方式训练的自动编码器进行插值。编码器是一个连体神经网络,用于将时间序列数据从DTW隐藏空间映射到欧氏深度特征空间,解码器用于将深度特征空间映射回DTW隐藏空间。我们在大量不同的平衡或非平衡时间序列数据集上验证了所提出的方法。实验结果表明,该方法能使下游深度学习模型具有更好的性能。 摘要:Access to labeled time series data is often limited in the real world, which constrains the performance of deep learning models in the field of time series analysis. Data augmentation is an effective way to solve the problem of small sample size and imbalance in time series datasets. The two key factors of data augmentation are the distance metric and the choice of interpolation method. SMOTE does not perform well on time series data because it uses a Euclidean distance metric and interpolates directly on the object. Therefore, we propose a DTW-based synthetic minority oversampling technique using siamese encoder for interpolation named DTWSSE. In order to reasonably measure the distance of the time series, DTW, which has been verified to be an effective method for time series, is employed as the distance metric. To adapt the DTW metric, we use an autoencoder trained in an unsupervised self-training manner for interpolation. The encoder is a Siamese Neural Network for mapping the time series data from the DTW hidden space to the Euclidean deep feature space, and the decoder is used to map the deep feature space back to the DTW hidden space. We validate the proposed methods on a number of different balanced or unbalanced time series datasets. Experimental results show that the proposed method can lead to better performance of the downstream deep learning model.
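A minimal reference implementation of the DTW distance that DTWSSE uses to select minority-class neighbours before interpolating in the encoder's feature space; the siamese autoencoder itself is not sketched here.

```python
import numpy as np

def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic-time-warping distance between
    two univariate series, via the standard cumulative-cost recursion."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw([0, 1, 2, 3], [0, 0, 1, 2, 3]))  # small warp -> small distance
```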

【35】 On Quantifying Literals in Boolean Logic and Its Applications to Explainable AI 标题:布尔逻辑中文字的量化及其在可解释人工智能中的应用 链接:https://arxiv.org/abs/2108.09876

作者:Adnan Darwiche,Pierre Marquis 机构:Computer Science Department, UCLA, Los Angeles, CA, USA, CRIL, Université d’Artois & CNRS, Institut Universitaire de France, Lens Cedex, France 备注:To be published in Journal of Artificial Intelligence Research (JAIR) with minor modifications 摘要:量化布尔逻辑是在布尔逻辑中加入对变量进行存在量化和全称量化的运算符的结果。这扩展了布尔逻辑的范围,实现了几十年来探索的各种应用。文献中也研究了文字(变量状态)的存在量化及其应用。在本文中,我们通过研究文字的全称量化及其应用,特别是对可解释人工智能的应用,来补充这一点。我们还提供了一种新的量化语义,讨论了变量/文字量化和存在/全称量化之间的相互作用。我们进一步确定了可以有效地进行量化的一些布尔公式和电路类。文字量化比变量量化更细粒度,因为后者可以根据前者定义。这导致了以文字量化为原语的量化布尔逻辑的细化。 摘要:Quantified Boolean logic results from adding operators to Boolean logic for existentially and universally quantifying variables. This extends the reach of Boolean logic by enabling a variety of applications that have been explored over the decades. The existential quantification of literals (variable states) and its applications have also been studied in the literature. In this paper, we complement this by studying universal literal quantification and its applications, particularly to explainable AI. We also provide a novel semantics for quantification, discuss the interplay between variable/literal and existential/universal quantification. We further identify some classes of Boolean formulas and circuits on which quantification can be done efficiently. Literal quantification is more fine-grained than variable quantification as the latter can be defined in terms of the former. This leads to a refinement of quantified Boolean logic with literal quantification as its primitive.
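For orientation, standard variable quantification in Boolean logic is computed by Shannon expansion, as in the sketch below; the paper's literal quantification is finer-grained (variable quantification can be defined in terms of it), and its exact semantics are not reproduced here.

```python
from itertools import product

def restrict(f, i, val):
    """Condition a Boolean function on x_i = val."""
    return lambda bits: f(bits[:i] + (val,) + bits[i + 1:])

def exists_var(f, i):
    """(exists x_i. f) = f|x_i=0 OR f|x_i=1 (Shannon expansion)."""
    return lambda bits: restrict(f, i, 0)(bits) or restrict(f, i, 1)(bits)

def forall_var(f, i):
    """(forall x_i. f) = f|x_i=0 AND f|x_i=1."""
    return lambda bits: restrict(f, i, 0)(bits) and restrict(f, i, 1)(bits)

# f(x0, x1, x2) = (x0 AND x1) OR x2; then (forall x0. f) reduces to x2
f = lambda b: (b[0] and b[1]) or b[2]
g = forall_var(f, 0)
for bits in product((0, 1), repeat=3):
    print(bits, g(bits))
```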

【36】 Embodied AI-Driven Operation of Smart Cities: A Concise Review 标题:具身人工智能驱动的智慧城市运营:简评 链接:https://arxiv.org/abs/2108.09823

作者:Farzan Shenavarmasouleh,Farid Ghareh Mohammadi,M. Hadi Amini,Hamid R. Arabnia 机构:Department of Computer Science, Franklin College of Arts and Sciences, University of Georgia, Athens, GA, USA; School of Computing & Information Sciences, College of Engineering & Computing, Florida International University, Miami, FL, USA 备注:Cyberphysical Smart Cities Infrastructures: Optimal Operation and Intelligent Decision Making 2021 摘要:智慧城市可以看作是一个由信息和通信技术(ICT)组成的框架。由连接设备组成的智能网络通过传感器收集数据,并使用云技术传输数据,以便与生态系统中的其他资产进行通信,在该框架中发挥着关键作用。最大限度地提高市民的生活质量、更好地利用资源、降低成本和提高可持续性是智慧城市追求的最终目标。因此,从连接设备收集的数据将不断得到彻底分析,以更好地了解整个城市正在提供的服务;考虑到这一目标,它们可以用来提高整个系统的效率。机器人和物理机器是智慧城市不可分割的组成部分。具身人工智能是一个研究领域,它深入研究这些技术,并探索它们如何适应现实世界的环境。它侧重于通过与周围环境的交互学习,而不是试图从静态数据集学习的互联网AI。具身人工智能的目标是训练一个能够同时看到(计算机视觉)、说话(NLP)、导航和与环境交互(强化学习)以及推理(一般智能)的智能体。自动驾驶汽车和个人伴侣是当今从具身人工智能中受益的一些例子。在本文中,我们试图对这一领域做一个简要的回顾。我们将介绍它的定义、特点和目前的成就,以及在其不同组成部分(如视觉、NLP、RL)中使用的不同算法、方法和解决方案。然后,我们将探索所有可用的模拟器和3D可交互数据库,使该领域的研究可行。最后,我们将应对其挑战,并确定其未来研究的潜力。 摘要:A smart city can be seen as a framework, comprised of Information and Communication Technologies (ICT). An intelligent network of connected devices that collect data with their sensors and transmit them using cloud technologies in order to communicate with other assets in the ecosystem plays a pivotal role in this framework. Maximizing the quality of life of citizens, making better use of resources, cutting costs, and improving sustainability are the ultimate goals that a smart city is after. Hence, data collected from connected devices will continuously get thoroughly analyzed to gain better insights into the services that are being offered across the city; with this goal in mind that they can be used to make the whole system more efficient. Robots and physical machines are inseparable parts of a smart city. Embodied AI is the field of study that takes a deeper look into these and explores how they can fit into real-world environments. It focuses on learning through interaction with the surrounding environment, as opposed to Internet AI which tries to learn from static datasets. Embodied AI aims to train an agent that can See (Computer Vision), Talk (NLP), Navigate and Interact with its environment (Reinforcement Learning), and Reason (General Intelligence), all at the same time. Autonomous driving cars and personal companions are some of the examples that benefit from Embodied AI nowadays. In this paper, we attempt to do a concise review of this field. We will go through its definitions, its characteristics, and its current achievements along with different algorithms, approaches, and solutions that are being used in different components of it (e.g. Vision, NLP, RL). We will then explore all the available simulators and 3D interactable databases that will make the research in this area feasible. Finally, we will address its challenges and identify its potentials for future research.

【37】 Wind Power Projection using Weather Forecasts by Novel Deep Neural Networks 标题:基于新型深度神经网络的天气预报风电预测 链接:https://arxiv.org/abs/2108.09797

作者:Alagappan Swaminathan,Venkatakrishnan Sutharsan,Tamilselvi Selvaraj 机构:Selvaraj, Associate Professor, SSN College of Engineering, Kalavakkam, Chennai 备注:27 pages, 12 figures, 12 tables, 7 equations, 22 references 摘要:从传统的能源生产方法过渡到可再生能源生产需要更好地预测即将到来的可再生能源供应。在风力发电生产中,由于风力的间歇性,预测产量的误差是不可能消除的。对于成功的电网整合,了解预测风力发电量时产生的不确定性并利用这些信息建立准确可靠的预测至关重要。这可以通过观察风力发电量随不同参数(如风速、温度和风向)变化的波动,并推导相应的函数依赖关系来实现。使用优化的机器学习算法,可以在观测中发现模糊模式并获得有意义的数据,然后可以使用这些数据准确预测风力发电需求。利用Gamesa位于Bableshwar的风电场提供的所需数据,本文探讨了利用功率曲线计算风电预测的参数模型和非参数模型的使用。对获得的结果进行比较,以更好地理解所用模型的准确性,并根据给定数据集确定预测风力发电量的最合适模型。 摘要:The transition from conventional methods of energy production to renewable energy production necessitates better prediction models of the upcoming supply of renewable energy. In wind power production, error in forecasting production is impossible to negate owing to the intermittence of wind. For successful power grid integration, it is crucial to understand the uncertainties that arise in predicting wind power production and use this information to build an accurate and reliable forecast. This can be achieved by observing the fluctuations in wind power production with changes in different parameters such as wind speed, temperature, and wind direction, and deriving functional dependencies for the same. Using optimized machine learning algorithms, it is possible to find obscured patterns in the observations and obtain meaningful data, which can then be used to accurately predict wind power requirements. Utilizing the required data provided by Gamesa's wind farm at Bableshwar, the paper explores the use of both parametric and non-parametric models for calculating wind power prediction using power curves. The obtained results are subject to comparison to better understand the accuracy of the utilized models and to determine the most suitable model for predicting wind power production based on the given data set.
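A hedged sketch of the two model families the abstract contrasts, a parametric logistic power-curve fit and a non-parametric binned-mean curve, run on synthetic stand-in data rather than the Bableshwar measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic_curve(v, p_max, k, v_mid):
    """Parametric power curve: logistic ramp from cut-in to rated power."""
    return p_max / (1.0 + np.exp(-k * (v - v_mid)))

# synthetic wind-speed / power observations (stand-ins for farm data)
v = np.linspace(3, 20, 200)
p = logistic_curve(v, 2000, 0.9, 9.0) + np.random.normal(0, 40, v.size)

params, _ = curve_fit(logistic_curve, v, p, p0=[1800, 1.0, 10.0])
print("fitted p_max, k, v_mid:", params)

# non-parametric alternative: binned means over the same observations
bins = np.digitize(v, np.arange(3, 21))
binned_power = [p[bins == b].mean() for b in np.unique(bins)]
print("binned power curve:", np.round(binned_power, 1))
```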

【38】 Rainfall-runoff prediction using a Gustafson-Kessel clustering based Takagi-Sugeno Fuzzy model 标题:基于Gustafson-Kessel聚类的Takagi-Sugeno模糊模型在降雨径流预测中的应用 链接:https://arxiv.org/abs/2108.09684

作者:Subhrasankha Dey,Tanmoy Dam 备注:This paper is under review to IEEE SSCI 2022 摘要:降雨径流模型使用基于物理的方法或基于系统的方法预测地表径流。Takagi-Sugeno(TS)模糊模型是一种基于系统的方法,近几十年来,由于与其他现有模型相比在预测方面的一些优势和更高的准确性,是水文学家的一种流行建模选择。在本文中,我们提出了一个新的降雨-径流模型,该模型采用基于GK聚类的TS模糊模型。我们给出了GK算法与其他两种聚类算法的性能比较指标:(i)模糊C-均值(FCM)和(ii)减法聚类(SC)。我们提出的TS模糊模型使用以下方法预测地表径流:(i)流域内观测到的降雨和(ii)流域出口先前观测到的降雨流量。利用安装在哈拉格布尔印度理工学院校园内的传感器收集的降雨径流数据验证了所提出的模型。通过不同的验证指标,得到了该模型的最优规则数。对每种聚类算法的四个性能标准:均方根误差(RMSE)、效率系数(CE)、体积误差(VE)和相关确定系数(R)进行了定量比较研究。 摘要:A rainfall-runoff model predicts surface runoff either using a physically-based approach or using a systems-based approach. Takagi-Sugeno (TS) Fuzzy models are systems-based approaches and a popular modeling choice for hydrologists in recent decades due to several advantages and improved accuracy in prediction over other existing models. In this paper, we propose a new rainfall-runoff model developed using Gustafson-Kessel (GK) clustering-based TS Fuzzy model. We present comparative performance measures of GK algorithms with two other clustering algorithms: (i) Fuzzy C-Means (FCM), and (ii) Subtractive Clustering (SC). Our proposed TS Fuzzy model predicts surface runoff using: (i) observed rainfall in a drainage basin and (ii) previously observed precipitation flow in the basin outlet. The proposed model is validated using the rainfall-runoff data collected from the sensors installed on the campus of the Indian Institute of Technology, Kharagpur. The optimal number of rules of the proposed model is obtained by different validation indices. A comparative study of four performance criteria: Root Mean Square Error (RMSE), Coefficient of Efficiency (CE), Volumetric Error (VE), and Correlation Coefficient of Determination (R) have been quantitatively demonstrated for each clustering algorithm.
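Two of the four reported criteria have standard closed forms that are easy to state in code; the sketch below computes RMSE and the Nash-Sutcliffe Coefficient of Efficiency (CE) on toy stand-in series, with VE and R omitted since their exact definitions vary across the hydrology literature.

```python
import numpy as np

def rmse(obs, sim):
    """Root Mean Square Error between observed and simulated runoff."""
    obs, sim = np.asarray(obs), np.asarray(sim)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def nash_sutcliffe(obs, sim):
    """Coefficient of Efficiency (Nash-Sutcliffe): 1 is a perfect fit,
    0 means no better than predicting the mean of the observations."""
    obs, sim = np.asarray(obs), np.asarray(sim)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

observed  = [1.2, 3.4, 2.8, 5.1, 4.0]   # stand-in runoff series
simulated = [1.0, 3.1, 3.0, 4.8, 4.3]
print(rmse(observed, simulated), nash_sutcliffe(observed, simulated))
```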

【39】 Evolutionary Ensemble Learning for Multivariate Time Series Prediction 标题:进化集成学习在多变量时间序列预测中的应用 链接:https://arxiv.org/abs/2108.09659

作者:Hui Song,A. K. Qin,Flora D. Salim 摘要:多变量时间序列(MTS)预测在金融、能源和交通等领域发挥着关键作用,其中每个单独的时间序列对应于从特定数据源(即所谓的通道)收集的数据。构建MTS预测模型(PM)的典型流程包括从所有可用通道中选择一个子集,从所选通道中提取特征,并基于提取的特征构建PM,其中每个组件都涉及某些优化任务,即通道选择、特征提取(FE)方法,和PM,以及所选FE方法和PM的配置。因此,追求最佳预测性能对应于通过解决所有涉及的优化问题来优化管道。由于解决方案空间的巨大性,这是一项非常重要的任务。与大多数现有的针对优化管道某些组件的工作不同,我们提出了一种新的进化集成学习框架来整体优化整个管道。在该框架中,将特定的管道编码为候选解,并在不同的种群规模下应用多目标进化算法生成多个Pareto最优集(POSs)。最后,设计了选择性集成学习,从POSs中选择最优解子集,并使用贪婪序列选择和最小二乘法将其组合,以产生最终预测。我们实现了所提出的框架,并在两个实际应用中评估了我们的实现,即用电量预测和空气质量预测。与最新技术的性能比较表明了该方法的优越性。 摘要:Multivariate time series (MTS) prediction plays a key role in many fields such as finance, energy and transport, where each individual time series corresponds to the data collected from a certain data source, so-called channel. A typical pipeline of building an MTS prediction model (PM) consists of selecting a subset of channels among all available ones, extracting features from the selected channels, and building a PM based on the extracted features, where each component involves certain optimization tasks, i.e., selection of channels, feature extraction (FE) methods, and PMs as well as configuration of the selected FE method and PM. Accordingly, pursuing the best prediction performance corresponds to optimizing the pipeline by solving all of its involved optimization problems. This is a non-trivial task due to the vastness of the solution space. Different from most of the existing works which target at optimizing certain components of the pipeline, we propose a novel evolutionary ensemble learning framework to optimize the entire pipeline in a holistic manner. In this framework, a specific pipeline is encoded as a candidate solution and a multi-objective evolutionary algorithm is applied under different population sizes to produce multiple Pareto optimal sets (POSs). Finally, selective ensemble learning is designed to choose the optimal subset of solutions from the POSs and combine them to yield final prediction by using greedy sequential selection and least square methods. We implement the proposed framework and evaluate our implementation on two real-world applications, i.e., electricity consumption prediction and air quality prediction. The performance comparison with state-of-the-art techniques demonstrates the superiority of the proposed approach.
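A hedged sketch of the selective-ensemble step the abstract describes: greedy sequential selection of members followed by a least-squares recombination; the candidate predictions here are random stand-ins for the Pareto-set models.

```python
import numpy as np

def greedy_lsq_ensemble(preds, target, max_members=5):
    """At each step add the candidate whose inclusion gives the lowest
    least-squares validation MSE, then return the chosen subset together
    with the optimal linear combination weights over it."""
    chosen = []
    for _ in range(max_members):
        best, best_err = None, np.inf
        for i in range(len(preds)):
            if i in chosen:
                continue
            P = np.stack([preds[j] for j in chosen + [i]], axis=1)
            w, *_ = np.linalg.lstsq(P, target, rcond=None)
            err = np.mean((P @ w - target) ** 2)
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
    P = np.stack([preds[j] for j in chosen], axis=1)
    w, *_ = np.linalg.lstsq(P, target, rcond=None)
    return chosen, w

preds = [np.random.rand(100) for _ in range(8)]   # Pareto-set member outputs
target = np.random.rand(100)                      # validation series
print(greedy_lsq_ensemble(preds, target))
```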

【40】 Signed Bipartite Graph Neural Networks 标题:符号二部图神经网络 链接:https://arxiv.org/abs/2108.09638

作者:Junjie Huang,Huawei Shen,Qi Cao,Shuchang Tao,Xueqi Cheng 机构:Data Intelligence System Research Center,CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, University of Chinese Academy of Sciences, Beijing, China 备注:Accepted and to appear at CIKM2021 摘要:符号网络是同时包含正向链接和负向链接的社交网络。已经发展了许多理论和算法来对此类网络进行建模(例如,平衡理论)。然而,以前的工作主要集中在节点类型相同的单方符号网络上。符号二部网络不同于经典的符号网络,它包含两个不同的节点集以及两个节点集之间的符号链接。符号二部网络在商业、政治和学术等许多领域都很常见,但研究较少。在这项工作中,我们首先定义了同一组节点的符号关系,为分析符号二部网络提供了一个新的视角。然后,我们从两个角度对平衡理论进行了综合分析,并对几个真实数据集进行了分析。具体而言,在同行评议数据集中,我们发现符号二部网络中的平衡同构比率在反驳阶段后增加。在这两种观点的指导下,我们提出了一种新的符号二部图神经网络(SBGNN)来学习符号二部网络的节点嵌入。SBGNN遵循大多数GNN的消息传递方案,但我们为符号二部网络设计了新的消息函数、聚合函数和更新函数。我们在四个真实数据集上验证了我们的模型在链路符号预测任务上的有效性,这是符号网络的主要机器学习任务。实验结果表明,与强基线方法(包括基于特征的方法和网络嵌入方法)相比,我们的SBGNN模型取得了显著的改进。 摘要:Signed networks are such social networks having both positive and negative links. A lot of theories and algorithms have been developed to model such networks (e.g., balance theory). However, previous work mainly focuses on the unipartite signed networks where the nodes have the same type. Signed bipartite networks are different from classical signed networks, which contain two different node sets and signed links between two node sets. Signed bipartite networks can be commonly found in many fields including business, politics, and academics, but have been less studied. In this work, we firstly define the signed relationship of the same set of nodes and provide a new perspective for analyzing signed bipartite networks. Then we do some comprehensive analysis of balance theory from two perspectives on several real-world datasets. Specifically, in the peer review dataset, we find that the ratio of balanced isomorphism in signed bipartite networks increased after rebuttal phases. Guided by these two perspectives, we propose a novel Signed Bipartite Graph Neural Networks (SBGNNs) to learn node embeddings for signed bipartite networks. SBGNNs follow most GNNs message-passing scheme, but we design new message functions, aggregation functions, and update functions for signed bipartite networks. We validate the effectiveness of our model on four real-world datasets on Link Sign Prediction task, which is the main machine learning task for signed networks. Experimental results show that our SBGNN model achieves significant improvement compared with strong baseline methods, including feature-based methods and network embedding methods.

【41】 DisenKGAT: Knowledge Graph Embedding with Disentangled Graph Attention Network 标题:DisenKGAT:基于解缠图注意力网络的知识图嵌入 链接:https://arxiv.org/abs/2108.09628

作者:Junkang Wu,Wentao Shi,Xuezhi Cao,Jiawei Chen,Wenqiang Lei,Fuzheng Zhang,Wei Wu,Xiangnan He 机构:University of Science and Technology of China,Meituan,National University of Singapore. 备注:CIKM2021 摘要:知识图谱补全(KGC)因其对众多下游任务的卓越贡献而成为深度学习社区关注的焦点。尽管最近在KGC方面的工作激增,但它们仍然不足以准确捕捉复杂的关系,因为它们采用了单一的静态表示。在这项工作中,我们为KGC提出了一种新的解纠缠知识图注意网络(DisenKGAT),它利用微观解纠缠和宏观解纠缠来利用知识图(KG)背后的表示。为了实现微观解纠缠,我们提出了一种新的关系感知聚合来学习不同的组件表示。对于宏观解纠缠,我们利用互信息作为正则化来增强独立性。在解纠缠的帮助下,我们的模型能够根据给定场景生成自适应表示。此外,我们的工作具有很强的鲁棒性和灵活性,以适应各种评分函数。在公共基准数据集上进行了大量的实验,以验证在准确性和可解释性方面,DisenKGAT优于现有方法。 摘要:Knowledge graph completion (KGC) has become a focus of attention across deep learning community owing to its excellent contribution to numerous downstream tasks. Although recently have witnessed a surge of work on KGC, they are still insufficient to accurately capture complex relations, since they adopt the single and static representations. In this work, we propose a novel Disentangled Knowledge Graph Attention Network (DisenKGAT) for KGC, which leverages both micro-disentanglement and macro-disentanglement to exploit representations behind Knowledge graphs (KGs). To achieve micro-disentanglement, we put forward a novel relation-aware aggregation to learn diverse component representation. For macro-disentanglement, we leverage mutual information as a regularization to enhance independence. With the assistance of disentanglement, our model is able to generate adaptive representations in terms of the given scenario. Besides, our work has strong robustness and flexibility to adapt to various score functions. Extensive experiments on public benchmark datasets have been conducted to validate the superiority of DisenKGAT over existing methods in terms of both accuracy and explainability.

【42】 Personalised Federated Learning: A Combinational Approach 标题:个性化联合学习:一种组合方法 链接:https://arxiv.org/abs/2108.09618

作者:Sone Kyaw Pye,Han Yu 机构:School of Computer Science and Engineering, Nanyang Technological University, Singapore 备注:in Proceedings of the 1st International Student Conference on Artificial Intelligence (STCAI'21), 2021 摘要:联邦学习(FL)是一种分布式机器学习方法,涉及多个客户端协作训练共享模型。这样一个系统的优点是来自多个客户机的更多训练数据,但数据可以是非相同和独立分布的(非i.i.d.)。隐私和完整性保护功能,如差异隐私(DP)和鲁棒聚合(RA)通常用于FL。在这项工作中,我们表明,在常见的深度学习任务中,FL模型的性能因客户和情况而异,并且由于非i.i.d.数据,FL模型有时比本地模型的性能更差。其次,我们表明,合并DP和RA会进一步降低性能。然后,我们对FL常用个性化方法的不同组合(如微调、专家组合、多任务学习和知识提炼)对性能的影响进行了研究。据观察,个性化方法的某些组合在某些场景中更具影响力,而其他方法总是能提高性能,并且组合方法优于单个方法。大多数客户通过组合个性化FL获得了更好的性能,并从非i.i.d.数据、DP和RA导致的性能下降中恢复过来。 摘要:Federated learning (FL) is a distributed machine learning approach involving multiple clients collaboratively training a shared model. Such a system has the advantage of more training data from multiple clients, but data can be non-identically and independently distributed (non-i.i.d.). Privacy and integrity preserving features such as differential privacy (DP) and robust aggregation (RA) are commonly used in FL. In this work, we show that on common deep learning tasks, the performance of FL models differs amongst clients and situations, and FL models can sometimes perform worse than local models due to non-i.i.d. data. Secondly, we show that incorporating DP and RA degrades performance further. Then, we conduct an ablation study on the performance impact of different combinations of common personalization approaches for FL, such as finetuning, mixture-of-experts ensemble, multi-task learning, and knowledge distillation. It is observed that certain combinations of personalization approaches are more impactful in certain scenarios while others always improve performance, and combination approaches are better than individual ones. Most clients obtained better performance with combined personalized FL and recover from performance degradation caused by non-i.i.d. data, DP, and RA.

【43】 Apache Submarine: A Unified Machine Learning Platform Made Simple 标题:Apache Submarine:一个简单易用的统一机器学习平台 链接:https://arxiv.org/abs/2108.09615

作者:Kai-Hsun Chen,Huan-Ping Su,Wei-Chiu Chuang,Hung-Chang Hsiao,Wangda Tan,Zhankun Tang,Xun Liu,Yanbo Liang,Wen-Chih Lo,Wanqiang Ji,Byron Hsu,Keqiu Hu,HuiYang Jian,Quan Zhou,Chien-Min Wang 机构:Academia Sinica, Cloudera, National Cheng Kung University, DiDi, Facebook, UC Berkeley, LinkedIn, KE Holdings, Ant Group 备注:9 pages 摘要:随着机器学习的应用越来越广泛,有必要为基础设施管理员和用户(包括专家数据科学家和公民数据科学家)提供一个机器学习平台,以提高他们的生产率。然而,现有的机器学习平台在解决“机器学习技术债务”(如胶水代码、可复现性和可移植性)方面装备不足。此外,现有平台只考虑专家数据科学家,因此它们对基础设施管理员缺乏灵活性,对公民数据科学家也不友好。我们提出了Submarine,一个统一的机器学习平台,以应对这些挑战。 摘要:As machine learning is applied more widely, it is necessary to have a machine learning platform for both infrastructure administrators and users including expert data scientists and citizen data scientists to improve their productivity. However, existing machine learning platforms are ill-equipped to address the "Machine Learning tech debts" such as glue code, reproducibility, and portability. Furthermore, existing platforms only take expert data scientists into consideration, and thus they are inflexible for infrastructure administrators and non-user-friendly for citizen data scientists. We propose Submarine, a unified machine learning platform, to address the challenges.

【44】 Programmable FPGA-based Memory Controller 标题:基于FPGA的可编程存储控制器 链接:https://arxiv.org/abs/2108.09601

作者:Sasindu Wijeratne,Sanket Pattnaik,Zhiyu Chen,Rajgopal Kannan,Viktor Prasanna 机构:∗Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, USA, †US Army Research Lab, Los Angeles, USA 摘要:即使DRAM技术有了一代又一代的改进,内存访问延迟仍然是应用程序加速器的主要瓶颈,这主要是由于内存接口IP的限制,无法充分考虑目标应用程序、使用的算法和加速器体系结构的变化。由于为不同的应用开发内存控制器非常耗时,本文介绍了一种模块化可编程内存控制器,它可以在可用的硬件资源上为不同的目标应用配置。该内存控制器有效地支持缓存线访问和大容量内存传输。用户可以根据FPGA上的可用逻辑资源、内存访问模式和外部内存规格配置控制器。模块化设计支持各种内存访问优化技术,包括请求调度、内部缓存和直接内存访问。这些技术有助于减少总体延迟,同时保持高持续带宽。我们在最先进的FPGA上实现了该系统,并使用两个广泛研究的领域评估了其性能:图分析和深度学习工作负载。与商用内存控制器IP相比,CNN和GCN工作负载上的总体内存访问时间最多可改善58%。 摘要:Even with generational improvements in DRAM technology, memory access latency still remains the major bottleneck for application accelerators, primarily due to limitations in memory interface IPs which cannot fully account for variations in target applications, the algorithms used, and accelerator architectures. Since developing memory controllers for different applications is time-consuming, this paper introduces a modular and programmable memory controller that can be configured for different target applications on available hardware resources. The proposed memory controller efficiently supports cache-line accesses along with bulk memory transfers. The user can configure the controller depending on the available logic resources on the FPGA, memory access pattern, and external memory specifications. The modular design supports various memory access optimization techniques including, request scheduling, internal caching, and direct memory access. These techniques contribute to reducing the overall latency while maintaining high sustained bandwidth. We implement the system on a state-of-the-art FPGA and evaluate its performance using two widely studied domains: graph analytics and deep learning workloads. We show improved overall memory access time up to 58% on CNN and GCN workloads compared with commercial memory controller IPs.

【45】 SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function 标题:SERF:使用log-Softplus误差激活函数更好地训练深度神经网络 链接:https://arxiv.org/abs/2108.09598

作者:Sayan Nag,Mayukh Bhattacharyya 机构:University of Toronto, Stony Brook University 摘要:激活函数在决定训练动态和神经网络性能方面起着关键作用。广泛采用的激活函数ReLU尽管简单有效,但也存在一些缺点,包括ReLU死亡(Dying ReLU)问题。为了解决这些问题,我们提出了一种新的激活函数,称为Serf,它是自正则的,本质上是非单调的。与Mish一样,Serf也属于Swish函数家族。基于计算机视觉(图像分类和目标检测)和自然语言处理(机器翻译、情感分类和多模态蕴涵)任务的多个实验,采用不同的最新架构,据观察,Serf的性能远远优于ReLU(基线)和其他激活函数,包括Swish和Mish,在更深层次的体系结构上具有更大的优势。消融研究进一步证明,基于Serf的体系结构在不同场景下的性能优于Swish和Mish,验证了Serf在不同深度、复杂度、优化器、学习率、批量大小、初始化方法和dropout率下的有效性和兼容性。最后,我们研究了Swish和Serf之间的数学关系,从而显示了Serf一阶导数中固有的预条件函数的影响,它提供了一种正则化效果,使梯度更平滑,优化速度更快。 摘要:Activation functions play a pivotal role in determining the training dynamics and neural network performance. The widely adopted activation function ReLU despite being simple and effective has few disadvantages including the Dying ReLU problem. In order to tackle such problems, we propose a novel activation function called Serf which is self-regularized and nonmonotonic in nature. Like Mish, Serf also belongs to the Swish family of functions. Based on several experiments on computer vision (image classification and object detection) and natural language processing (machine translation, sentiment classification and multimodal entailment) tasks with different state-of-the-art architectures, it is observed that Serf vastly outperforms ReLU (baseline) and other activation functions including both Swish and Mish, with a markedly bigger margin on deeper architectures. Ablation studies further demonstrate that Serf based architectures perform better than those of Swish and Mish in varying scenarios, validating the effectiveness and compatibility of Serf with varying depth, complexity, optimizers, learning rates, batch sizes, initializers and dropout rates. Finally, we investigate the mathematical relation between Swish and Serf, thereby showing the impact of preconditioner function ingrained in the first derivative of Serf which provides a regularization effect making gradients smoother and optimization faster.
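A one-line sketch of the activation as the paper's name suggests it, Serf(x) = x · erf(ln(1 + e^x)), i.e. log-softplus passed through the Gauss error function; treat this as an inferred definition rather than the authors' reference code.

```python
import numpy as np
from scipy.special import erf

def serf(x):
    """Serf(x) = x * erf(softplus(x)) with softplus(x) = ln(1 + e^x).
    np.logaddexp(0, x) is a numerically stable softplus."""
    return x * erf(np.logaddexp(0.0, x))

print(serf(np.linspace(-4.0, 4.0, 9)))
```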

【46】 Hierarchical Summarization for Longform Spoken Dialog 标题:长篇口语对话的层次化摘要 链接:https://arxiv.org/abs/2108.09597

作者:Daniel Li,Thomas Chen,Albert Tung,Lydia Chilton 机构:Columbia University, New York, New York, USA, Microsoft, Redmond, Washington, USA, Stanford University, Palo Alto, California, USA, Lydia B. Chilton 摘要:每天我们都被口语对话包围着。这一媒介以听觉方式传递丰富多样的信息流;然而,系统地理解对话往往并非易事。尽管口语对话非常普遍,但自动语音理解和高质量信息提取仍然非常差,尤其是与书面散文相比。此外,与理解文本相比,听觉交流带来了许多额外的挑战,如说话人不流利、非正式的散文风格和缺乏结构。这些问题都表明需要一个独特的语音定制交互系统来帮助用户理解和导航口语领域。虽然个人自动语音识别(ASR)和文本摘要方法已经存在,但它们是不完善的技术;既不考虑用户的目的和意图,也不解决口语引起的并发症。因此,我们设计了一个两阶段ASR和文本摘要流水线,并提出了一套语义分割和合并算法来解决这些语音建模难题。我们的系统使用户能够轻松浏览和导航内容,以及从这些底层技术中的错误中恢复。最后,我们对系统进行了评估,强调了用户对分层摘要的偏好,将其作为快速浏览音频和识别用户感兴趣内容的工具。 摘要:Every day we are surrounded by spoken dialog. This medium delivers rich diverse streams of information auditorily; however, systematically understanding dialog can often be non-trivial. Despite the pervasiveness of spoken dialog, automated speech understanding and quality information extraction remains markedly poor, especially when compared to written prose. Furthermore, compared to understanding text, auditory communication poses many additional challenges such as speaker disfluencies, informal prose styles, and lack of structure. These concerns all demonstrate the need for a distinctly speech tailored interactive system to help users understand and navigate the spoken language domain. While individual automatic speech recognition (ASR) and text summarization methods already exist, they are imperfect technologies; neither consider user purpose and intent nor address spoken language induced complications. Consequently, we design a two stage ASR and text summarization pipeline and propose a set of semantic segmentation and merging algorithms to resolve these speech modeling challenges. Our system enables users to easily browse and navigate content as well as recover from errors in these underlying technologies. Finally, we present an evaluation of the system which highlights user preference for hierarchical summarization as a tool to quickly skim audio and identify content of interest to the user.
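A deliberately simple stand-in for the semantic segmentation step of such a pipeline: start a new segment whenever lexical overlap between adjacent transcript sentences drops below a threshold; the paper's actual segmentation and merging algorithms are more sophisticated.

```python
def jaccard(a, b):
    """Jaccard word-overlap between two sentences."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / max(len(a | b), 1)

def segment_transcript(sentences, threshold=0.15):
    """Greedy segmentation: cut whenever adjacent sentences share too
    few words, a crude proxy for a topic shift in the ASR transcript."""
    segments, current = [], [sentences[0]]
    for prev, cur in zip(sentences, sentences[1:]):
        if jaccard(prev, cur) < threshold:
            segments.append(current)
            current = []
        current.append(cur)
    segments.append(current)
    return segments

transcript = [
    "today we will talk about neural networks",
    "neural networks learn from data",
    "now let's move to the budget for next quarter",
    "the budget depends on quarterly revenue",
]
for seg in segment_transcript(transcript):
    print(seg)
```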

【47】 Learning Causal Models of Autonomous Agents using Interventions 标题:使用干预的自治Agent的学习因果模型 链接:https://arxiv.org/abs/2108.09586

作者:Pulkit Verma,Siddharth Srivastava 机构:School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA 备注:IJCAI 2021 Workshop on Generalization in Planning 摘要:人工智能系统广泛应用的障碍之一,是缺乏能让非专业人员确保此类系统安全可靠运行的可解释性。我们扩展了对agent评估模块的分析,该模块允许AI系统在模拟器中执行高级指令序列,并回答用户关于其执行动作序列的查询。我们证明了这样一种原始的查询响应能力足以在平稳、完全可观测和确定性的环境中有效地导出用户可解释的系统因果模型。我们还介绍了动态因果决策网络(DCDNs),该网络捕获类STRIPS域的因果结构。本文还对不同类型的查询进行了比较分析,分析了回答这些查询所需的计算需求以及评估其响应以学习正确模型所需的工作。 摘要:One of the several obstacles in the widespread use of AI systems is the lack of requirements of interpretability that can enable a layperson to ensure the safe and reliable behavior of such systems. We extend the analysis of an agent assessment module that lets an AI system execute high-level instruction sequences in simulators and answer the user queries about its execution of sequences of actions. We show that such a primitive query-response capability is sufficient to efficiently derive a user-interpretable causal model of the system in stationary, fully observable, and deterministic settings. We also introduce dynamic causal decision networks (DCDNs) that capture the causal structure of STRIPS-like domains. A comparative analysis of different classes of queries is also presented in terms of the computational requirements needed to answer them and the efforts required to evaluate their responses to learn the correct model.

【48】 Automating Crystal-Structure Phase Mapping: Combining Deep Learning with Constraint Reasoning 标题:深度学习与约束推理相结合的晶体结构相图自动化 链接:https://arxiv.org/abs/2108.09523

作者:Di Chen,Yiwei Bai,Sebastian Ament,Wenting Zhao,Dan Guevarra,Lan Zhou,Bart Selman,R. Bruce van Dover,John M. Gregoire,Carla P. Gomes 机构: Cornell University, Department of Computer Science, California Institute of Technology, Joint Center for Artificial Photosynthesis, Cornell University, Department of Materials Science and Engineering 摘要:晶体结构相位映射是材料科学中一个核心的长期挑战,需要识别合成材料中的晶体结构或其混合物。材料科学专家擅长解决简单系统,但无法解决复杂系统,这在高通量材料发现中造成了一个主要瓶颈。在此,我们展示了如何自动化晶体结构相位映射。我们将相位映射描述为一个无监督模式分离问题,并描述了如何使用深度推理网络(DRNets)解决它。DRNets将深度学习与约束推理结合起来,以整合科学先验知识,因此只需要少量(未标记的)数据。DRNets利用和放大控制晶体混合物的热力学规则的丰富先验知识,并将约束推理无缝集成到神经网络优化中,从而弥补了有限的数据。DRNets设计了一个可解释的潜在空间,用于编码先验知识域约束,并将约束推理无缝地集成到神经网络优化中。DRNets在晶体结构相图、揭示Bi-Cu-V氧化物相图以及帮助发现太阳能燃料材料方面超越了以往的方法。 摘要:Crystal-structure phase mapping is a core, long-standing challenge in materials science that requires identifying crystal structures, or mixtures thereof, in synthesized materials. Materials science experts excel at solving simple systems but cannot solve complex systems, creating a major bottleneck in high-throughput materials discovery. Herein we show how to automate crystal-structure phase mapping. We formulate phase mapping as an unsupervised pattern demixing problem and describe how to solve it using Deep Reasoning Networks (DRNets). DRNets combine deep learning with constraint reasoning for incorporating scientific prior knowledge and consequently require only a modest amount of (unlabeled) data. DRNets compensate for the limited data by exploiting and magnifying the rich prior knowledge about the thermodynamic rules governing the mixtures of crystals with constraint reasoning seamlessly integrated into neural network optimization. DRNets are designed with an interpretable latent space for encoding prior-knowledge domain constraints and seamlessly integrate constraint reasoning into neural network optimization. DRNets surpass previous approaches on crystal-structure phase mapping, unraveling the Bi-Cu-V oxide phase diagram, and aiding the discovery of solar-fuels materials.

【49】 Flikcer -- A Chrome Extension to Resolve Online Epileptogenic Visual Content with Real-Time Luminance Frequency Analysis 标题:Flikcer:通过实时亮度频率分析解析在线致痫视觉内容的Chrome扩展 链接:https://arxiv.org/abs/2108.09491

作者:Jaisal Kothari,Ashay Srivastava 机构:Student, Amity International School, Saket, New Delhi, Delhi Public School, RK Puram 摘要:具有快速亮度变化或具有高对比度空间模式的视频内容(称为致痫性视觉内容)可能会导致患有光敏性癫痫的观众癫痫发作,甚至会导致未受该疾病影响的用户感到不适。Flikcer是一款以网站和chrome扩展为形式的网络应用,旨在解决视频中的癫痫内容。它提供了癫痫发作的可能触发因素的数量。它还提供了这些触发器的时间戳以及更安全的视频版本,可免费下载。该算法是用Python编写的,使用机器学习和计算机视觉。该算法的一个关键方面是其计算效率,允许公共用户实时实现。 摘要:Video content with fast luminance variations, or with spatial patterns of high contrast - referred to as epileptogenic visual content - may induce seizures on viewers with photosensitive epilepsy, and even cause discomfort in users not affected by this disease. Flikcer is a web app in the form of a website and chrome extension which aims to resolve epileptic content in videos. It provides the number of possible triggers for a seizure. It also provides the timestamps for these triggers along with a safer version of the video, free to download. The algorithm is written in Python and uses machine learning and computer vision. A key aspect of the algorithm is its computational efficiency, allowing real time implementation for public users.
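A hedged sketch of the core luminance-frequency check: average luminance per frame, large frame-to-frame swings counted over a one-second window, and a window flagged when the count exceeds roughly the WCAG three-flashes-per-second guideline; thresholds here are illustrative, not Flikcer's exact rules.

```python
import numpy as np

def flash_triggers(frames, fps=30.0, delta=20.0, max_per_sec=3.0):
    """Flag possible seizure triggers in a stack of grayscale frames:
    compute mean luminance per frame, mark frame-to-frame jumps larger
    than `delta`, and report one-second windows with too many jumps."""
    lum = frames.reshape(len(frames), -1).mean(axis=1)
    swings = np.abs(np.diff(lum)) > delta
    triggers = []
    window = int(fps)                       # one-second sliding window
    for start in range(len(swings) - window + 1):
        if swings[start:start + window].sum() > max_per_sec:
            triggers.append(start / fps)    # timestamp in seconds
    return triggers

video = np.random.randint(0, 255, size=(120, 48, 64)).astype(float)
print(flash_triggers(video)[:5])
```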

【50】 CushLEPOR: Customised hLEPOR Metric Using LABSE Distilled Knowledge Model to Improve Agreement with Human Judgements 标题:CushLEPOR:使用LABSE精炼知识模型改进与人类判断的一致性的定制hLEPOR度量 链接:https://arxiv.org/abs/2108.09484

作者:Lifeng Han,Irina Sorokina,Gleb Erofeev,Serge Gladkoff 机构: ADAPT Research Centre, DCU, Ireland, Logrus Global, Translation & Localization 备注:Extended work from MT SUMMIT 2021: Gleb Erofeev, Irina Sorokina, Lifeng Han, and Serge Gladkoff. 2021. cushLEPOR uses LABSE distilled knowledge to improve correlation with human translation evaluations. In Proceedings for the MT summit - User Track (In Press), online. Association for Computa- tional Linguistics & AMTA 摘要:人工评估的成本一直很高,而研究人员又难以完全信任自动度量。为了解决这个问题,我们建议通过利用预先训练的语言模型(PLM)和有限的可用人类标记分数来定制传统指标。我们首先重新介绍了hLEPOR度量因子,然后介绍了我们开发的Python可移植版本,该版本实现了hLEPOR度量中权重参数的自动调整。然后,我们提出了定制的hLEPOR(cushLEPOR),它使用LABSE提取的知识模型,通过自动优化与cushLEPOR部署到的精确机器翻译语言对相关的因子权重,来改进度量与人类判断的一致性。我们还优化了基于MQM和pSQM框架的英语-德语和汉英语言对的cushLEPOR人类评估数据。实验研究表明,cushLEPOR以更低的成本提高了hLEPOR的性能,使其与PLM(如LABSE)取得更好的一致性,并与人类评估(包括MQM和pSQM分数)取得更好的一致性,并且产生了比BLEU更好的性能(数据可在 https://github.com/poethan/cushLEPOR 获取)。 摘要:Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited available human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python portable version we developed which achieved the automatic tuning of the weighting parameters in hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR) which uses LABSE distilled knowledge model to improve the metric agreement with human judgements by automatically optimised factor weights regarding the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards human evaluation data based on MQM and pSQM framework on English-German and Chinese-English language pairs. The experimental investigations show cushLEPOR boosts hLEPOR performances towards better agreements to PLMs like LABSE with much lower cost, and better agreements to human evaluations including MQM and pSQM scores, and yields much better performances than BLEU (data available at https://github.com/poethan/cushLEPOR).

【51】 MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl 标题:MimicBot:结合模仿学习与强化学习在Bot Bowl中取胜 链接:https://arxiv.org/abs/2108.09478

作者:Nicola Pezzotti 机构:AI, Data Science and Digital Twin Department, Philips Research, Eindhoven, The Netherlands, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands 摘要:本文描述了一个经过训练、参加Bot Bowl III竞赛、在Fantasy Football AI中对弈的混合智能体。智能体MimicBot使用专门设计的深层策略网络实现,并使用模仿和强化学习相结合的方式进行训练。以前在这种情况下使用强化学习方法的尝试失败的原因有很多,例如,由于环境中固有的随机性和可用动作数量庞大且不均匀,课程学习方法未能始终击败随机行动的智能体。目前,没有任何机器学习方法可以打败利用游戏领域知识的脚本机器人。我们的解决方案,得益于模仿学习和混合决策过程,始终击败了这种脚本智能体。此外,我们还阐明了如何在强化学习环境中更有效地训练,同时大幅提高样本效率。MimicBot是Bot Bowl III竞赛的获胜者,目前是最先进的解决方案。 摘要:This paper describes a hybrid agent trained to play in Fantasy Football AI which participated in the Bot Bowl III competition. The agent, MimicBot, is implemented using a specifically designed deep policy network and trained using a combination of imitation and reinforcement learning. Previous attempts in using a reinforcement learning approach in such context failed for a number of reasons, e.g. due to the intrinsic randomness in the environment and the large and uneven number of actions available, with a curriculum learning approach failing to consistently beat a randomly playing agent. Currently no machine learning approach can beat a scripted bot which makes use of the domain knowledge on the game. Our solution, thanks to an imitation learning and a hybrid decision-making process, consistently beat such scripted agents. Moreover we shed light on how to more efficiently train in a reinforcement learning setting while drastically increasing sample efficiency. MimicBot is the winner of the Bot Bowl III competition, and it is currently the state-of-the-art solution.

【52】 Robust Ensembling Network for Unsupervised Domain Adaptation 标题:用于无监督领域自适应的鲁棒集成网络 链接:https://arxiv.org/abs/2108.09473

作者:Han Sun,Lei Lin,Ningzhong Liu,Huiyu Zhou 机构: Nanjing University of Aeronautics and Astronautics, Jiangsu Nanjing, China, School of Informatics, University of Leicester, Leicester LE,RH, U.K 备注:14 pages, 4 figures. accepted by PRICA-2021. code: this https URL 摘要:最近,为了解决无监督领域自适应(UDA)问题,人们提出了大量的研究来实现可转移模型。其中,最常用的方法是对抗域自适应,它可以缩短源域和目标域之间的距离。尽管对抗式学习非常有效,但它仍然会导致网络的不稳定性和混淆类别信息的缺点。在本文中,我们提出了一种用于UDA的鲁棒集成网络(REN),该网络应用鲁棒的时间集成教师网络来学习全局信息以进行域迁移。具体地说,REN主要包括教师网络和学生网络,其执行标准域适应训练并更新教师网络的权重。此外,我们还提出了一种双网络条件对抗性丢失来提高鉴别器的能力。最后,为了提高学生网络的基本能力,我们利用一致性约束来平衡学生网络和教师网络之间的误差。在几个UDA数据集上的大量实验结果表明,与其他最先进的UDA算法相比,我们的模型是有效的。 摘要:Recently, in order to address the unsupervised domain adaptation (UDA) problem, extensive studies have been proposed to achieve transferrable models. Among them, the most prevalent method is adversarial domain adaptation, which can shorten the distance between the source domain and the target domain. Although adversarial learning is very effective, it still leads to the instability of the network and the drawbacks of confusing category information. In this paper, we propose a Robust Ensembling Network (REN) for UDA, which applies a robust time ensembling teacher network to learn global information for domain transfer. Specifically, REN mainly includes a teacher network and a student network, which performs standard domain adaptation training and updates weights of the teacher network. In addition, we also propose a dual-network conditional adversarial loss to improve the ability of the discriminator. Finally, for the purpose of improving the basic ability of the student network, we utilize the consistency constraint to balance the error between the student network and the teacher network. Extensive experimental results on several UDA datasets have demonstrated the effectiveness of our model by comparing with other state-of-the-art UDA algorithms.

【53】 DeepEdgeBench: Benchmarking Deep Neural Networks on Edge Devices 标题:DeepEdgeBench:边缘设备上的深度神经网络基准测试 链接:https://arxiv.org/abs/2108.09457

作者:Stephan Patrick Baller,Anshul Jindal,Mohak Chadha,Michael Gerndt 机构:Chair of Computer Architecture and Parallel Systems, Technische Universität München, Garching (near Munich), Germany 备注:12 pages, accepted at IC2E'21 摘要:EdgeAI(基于边缘计算的人工智能)在过去几年中得到了最积极的研究,用于处理各种大规模分布式AI应用程序,以满足严格的延迟要求。与此同时,许多公司已经发布了外形更小(功耗低、资源有限)的边缘设备,如流行的Raspberry Pi和Nvidia的Jetson Nano,用于在边缘计算环境中充当计算节点。尽管边缘设备在计算能力和硬件资源方面受到限制,但它们由加速器提供动力,以增强其性能。因此,研究基于人工智能的深度神经网络如何在资源有限的设备上运行是很有趣的。在这项工作中,我们介绍并比较了四种片上系统(SoC)在不同深度学习模型和框架下的推理时间和功耗性能:Asus Tinker Edge R、Raspberry Pi 4、Google Coral Dev Board、Nvidia Jetson Nano和一种微控制器Arduino Nano 33 BLE。我们还提供了一种测量设备功耗、推断时间和精度的方法,可以很容易地扩展到其他设备。我们的结果显示,对于基于Tensorflow的量化模型,Google Coral Dev Board在推理时间和功耗方面提供了最佳性能。在推理计算时间较短的情况下,即MobileNetV2的计算时间不到29.3%,Jetson Nano的性能比其他设备更快。 摘要:EdgeAI (Edge computing based Artificial Intelligence) has been most actively researched for the last few years to handle variety of massively distributed AI applications to meet up the strict latency requirements. Meanwhile, many companies have released edge devices with smaller form factors (low power consumption and limited resources) like the popular Raspberry Pi and Nvidia's Jetson Nano for acting as compute nodes at the edge computing environments. Although the edge devices are limited in terms of computing power and hardware resources, they are powered by accelerators to enhance their performance behavior. Therefore, it is interesting to see how AI-based Deep Neural Networks perform on such devices with limited resources. In this work, we present and compare the performance in terms of inference time and power consumption of the four Systems on a Chip (SoCs): Asus Tinker Edge R, Raspberry Pi 4, Google Coral Dev Board, Nvidia Jetson Nano, and one microcontroller: Arduino Nano 33 BLE, on different deep learning models and frameworks. We also provide a method for measuring power consumption, inference time and accuracy for the devices, which can be easily extended to other devices. Our results showcase that, for Tensorflow based quantized model, the Google Coral Dev Board delivers the best performance, both for inference time and power consumption. For a low fraction of inference computation time, i.e. less than 29.3% of the time for MobileNetV2, the Jetson Nano performs faster than the other devices.
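A minimal timing harness of the kind such benchmarks rely on, with warm-up runs before measurement so lazily initialised runtimes do not skew the median latency; the power-measurement side of the paper needs external metering hardware and is not reproduced here.

```python
import time
import numpy as np

def benchmark(infer_fn, sample, warmup=10, runs=100):
    """Median single-sample inference latency in milliseconds."""
    for _ in range(warmup):          # warm up caches / lazy initialisation
        infer_fn(sample)
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        infer_fn(sample)
        times.append(time.perf_counter() - t0)
    return float(np.median(times)) * 1e3

dummy_model = lambda x: x @ np.random.rand(224, 10)   # stand-in for a real DNN
print(f"{benchmark(dummy_model, np.random.rand(1, 224)):.3f} ms")
```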

【54】 "Adversarial Examples" for Proof-of-Learning 链接:https://arxiv.org/abs/2108.09454

作者:Rui Zhang,Jian Liu,Yuan Ding,Qingbiao Wu,Kui Ren 机构:Zhejiang University 摘要:在S&P'21中,Jia等人提出了一种新的概念/机制,名为学习证明(PoL),它允许验证人通过证明训练过程的完整性来证明机器学习模型的所有权。它保证了对手不能以比证明者生成证明的成本更低的成本(计算和存储)构造有效证明。PoL证明包括一组在训练期间记录的中间模型,以及用于获得每个记录模型的相应数据点。Jia等人声称,仅仅知道最终模型和训练数据集的对手无法有效地找到一组具有正确数据点的中间模型。然而,在本文中,我们表明PoL容易受到“对抗性示例”的攻击!具体来说,与优化对抗性示例类似,我们可以使任意选择的数据点“生成”给定模型,从而有效地生成具有正确数据点的中间模型。我们从理论和经验上证明,我们能够以比证明者生成证明所需的成本少得多的成本生成有效证明,从而成功地打破了PoL。 摘要:In S&P '21, Jia et al. proposed a new concept/mechanism named proof-of-learning (PoL), which allows a prover to demonstrate ownership of a machine learning model by proving integrity of the training procedure. It guarantees that an adversary cannot construct a valid proof with less cost (in both computation and storage) than that made by the prover in generating the proof. A PoL proof includes a set of intermediate models recorded during training, together with the corresponding data points used to obtain each recorded model. Jia et al. claimed that an adversary merely knowing the final model and training dataset cannot efficiently find a set of intermediate models with correct data points. In this paper, however, we show that PoL is vulnerable to "adversarial examples"! Specifically, in a similar way as optimizing an adversarial example, we could make an arbitrarily-chosen data point "generate" a given model, hence efficiently generating intermediate models with correct data points. We demonstrate, both theoretically and empirically, that we are able to generate a valid proof with significantly less cost than generating a proof by the prover, thereby we successfully break PoL.
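A toy, closed-form version of the attack idea on a linear least-squares model: given consecutive recorded weights w0 and w1 from a proof, construct a data point (x, y) such that a single SGD step from w0 lands exactly on w1; real PoL proofs involve deep networks and approximate matching, so this only illustrates why spoofing a recorded training step can be cheap.

```python
import numpy as np

rng = np.random.default_rng(0)
w0, w1, lr = rng.normal(size=5), rng.normal(size=5), 0.1

d = w0 - w1                      # the update the spoofed step must produce
x = d / np.linalg.norm(d)        # pick the data point along that direction
# SGD step on loss (w.x - y)^2:  w1 = w0 - lr * 2 * (w0 @ x - y) * x
# with x fixed, solve this for the label y:
y = w0 @ x - np.linalg.norm(d) / (2.0 * lr)

w_after = w0 - lr * 2.0 * (w0 @ x - y) * x
print("step error:", np.linalg.norm(w_after - w1))   # ~1e-16
```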

【55】 Learn-Explain-Reinforce: Counterfactual Reasoning and Its Guidance to Reinforce an Alzheimer's Disease Diagnosis Model 标题:学习-解释-强化:反事实推理及其对强化阿尔茨海默病诊断模型的指导 链接:https://arxiv.org/abs/2108.09451

作者:Kwanseok Oh,Jee Seok Yoon,Heung-Il Suk 机构: Yoon is with the Department of Brain and Cognitive Engi-neering, Korea University 备注:14 pages, 9 figures 摘要:关于疾病诊断模型的现有研究要么侧重于诊断模型学习以提高性能,要么侧重于对经过训练的诊断模型的直观解释。我们提出了一个新的学习-解释-强化(LEAR)框架,该框架将诊断模型学习、可视化解释生成(解释单元)和在可视化解释指导下训练的诊断模型强化(强化单元)统一起来。对于视觉解释,我们生成一个反事实映射,该映射将输入样本转换为预期目标标签。例如,反事实地图可以定位正常大脑图像中的假设异常,这可能导致其被诊断为阿尔茨海默病(AD)。我们认为,生成的反事实图代表了关于目标任务的数据驱动和模型诱导知识,即使用结构MRI进行AD诊断,这可能是加强训练诊断模型泛化的重要信息来源。为此,我们在反事实地图的指导下设计了一个基于注意的特征求精模块。解释和强化单元是相互的,可以迭代操作。通过对ADNI数据集的定性和定量分析,我们提出的方法得到了验证。通过消融研究和与现有方法的比较,证明了该方法的可理解性和保真度。 摘要:Existing studies on disease diagnostic models focus either on diagnostic model learning for performance improvement or on the visual explanation of a trained diagnostic model. We propose a novel learn-explain-reinforce (LEAR) framework that unifies diagnostic model learning, visual explanation generation (explanation unit), and trained diagnostic model reinforcement (reinforcement unit) guided by the visual explanation. For the visual explanation, we generate a counterfactual map that transforms an input sample to be identified as an intended target label. For example, a counterfactual map can localize hypothetical abnormalities within a normal brain image that may cause it to be diagnosed with Alzheimer's disease (AD). We believe that the generated counterfactual maps represent data-driven and model-induced knowledge about a target task, i.e., AD diagnosis using structural MRI, which can be a vital source of information to reinforce the generalization of the trained diagnostic model. To this end, we devise an attention-based feature refinement module with the guidance of the counterfactual maps. The explanation and reinforcement units are reciprocal and can be operated iteratively. Our proposed approach was validated via qualitative and quantitative analysis on the ADNI dataset. Its comprehensibility and fidelity were demonstrated through ablation studies and comparisons with existing methods.

【56】 Towards Personalized and Human-in-the-Loop Document Summarization 标题:走向个性化、人性化的文献摘要 链接:https://arxiv.org/abs/2108.09443

作者:Samira Ghodratnama 机构:PhD in Computer Science, A thesis submitted to Macquarie University, for the degree of, Doctor of Philosophy, Department of Computing, arXiv:,.,v, [cs.AI] , Aug 摘要:计算设备的普遍可用性和互联网的广泛使用不断产生大量数据。因此,任何给定主题的可用信息量都远远超出了人类正确处理的能力,导致了所谓的信息过载。为了有效地处理大量信息并生成对用户具有重要价值的内容,我们需要识别、合并和总结信息。数据摘要有助于收集相关信息,并将其收集成较短的格式,从而能够回答复杂的问题,获得新的见解并发现概念边界。这篇论文主要关注三个主要的挑战,以缓解信息过载使用新的总结技术。它还打算促进文档分析,以支持个性化信息提取。本论文将研究问题分为四个方面,包括(i)文档摘要中的特征工程,(ii)传统静态和不灵活的摘要,(iii)传统通用摘要方法,以及(iv)参考摘要的需要。我们提出了应对这些挑战的新方法:i)启用自动智能功能工程,ii)启用灵活和交互式摘要,iii)利用智能和个性化摘要方法。实验结果表明,与其他先进的模型相比,本文提出的方法是有效的。通过总结,我们进一步提出了不同领域信息过载问题的解决方案,包括网络流量数据、健康数据和业务流程数据。 摘要:The ubiquitous availability of computing devices and the widespread use of the internet have generated a large amount of data continuously. Therefore, the amount of available information on any given topic is far beyond humans' processing capacity to properly process, causing what is known as information overload. To efficiently cope with large amounts of information and generate content with significant value to users, we require identifying, merging and summarising information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges, by: i)enabling automatic intelligent feature engineering, ii) enabling flexible and interactive summarisation, iii) utilising intelligent and personalised summarisation approaches. The experimental results prove the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.

【57】 ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators 标题:ARAPReg:一种学习可变形形状生成器的尽可能刚性的正则化损失 链接:https://arxiv.org/abs/2108.09432

作者:Qixing Huang,Xiangru Huang,Bo Sun,Zaiwei Zhang,Junfeng Jiang,Chandrajit Bajaj 机构:UT Austin & MIT, Hohai University 摘要:本文介绍了一种用于训练参数化变形形状生成器的无监督损失。关键思想是在生成的形状中加强局部刚性的保持。我们的方法建立在尽可能刚性(as-rigid-as-possible,ARAP)变形能量的近似值上。我们展示了如何通过ARAP能量Hessian的谱分解来推导该无监督损失。我们的损失通过鲁棒范数很好地解耦了姿态和形状的变化。该损失具有简单的闭式表达式。它易于训练,可插入任何标准生成模型,例如变分自动编码器(VAE)和自动解码器(AD)。实验结果表明,在人类、动物和骨骼等各种形状类别的公共基准数据集上,我们的方法比现有的形状生成方法有很大的优势。 摘要:This paper introduces an unsupervised loss for training parametric deformation shape generators. The key idea is to enforce the preservation of local rigidity among the generated shapes. Our approach builds on an approximation of the as-rigid-as possible (or ARAP) deformation energy. We show how to develop the unsupervised loss via a spectral decomposition of the Hessian of the ARAP energy. Our loss nicely decouples pose and shape variations through a robust norm. The loss admits simple closed-form expressions. It is easy to train and can be plugged into any standard generation models, e.g., variational auto-encoder (VAE) and auto-decoder (AD). Experimental results show that our approach outperforms existing shape generation approaches considerably on public benchmark datasets of various shape categories such as human, animal and bone.

【58】 Safe Transformative AI via a Windfall Clause 标题:通过意外之财条款实现安全的变革性人工智能 链接:https://arxiv.org/abs/2108.09404

作者:Paolo Bova,Jonas Emanuel Müller,Benjamin Harack 摘要:社会很快就会看到变革性人工智能(TAI)。TAI的竞争模型表明,企业面临着在确保安全之前就部署TAI系统的强大竞争压力。本文探讨了解决这一问题的一个提议,即“意外之财条款”(Windfall Clause),其中开发商承诺将任何最终的巨额利润的很大一部分捐赠给公益事业。然而,意外之财条款的一个关键挑战是,企业必须有理由加入该条款。企业还必须相信这些承诺是可信的。我们用意外之财条款扩展了TAI竞争模型,以展示企业和决策者如何设计一个能够克服这些挑战的意外之财条款。令人鼓舞的是,在各种各样的情况下,公司从加入意外之财条款中获益。我们还发现,当竞争更加危险时,企业更经常加入意外之财条款。即使企业相互了解对方的能力,企业也很少希望撤回对意外之财条款的支持。这三个发现加强了使用意外之财条款促进TAI安全发展的理由。 摘要:Society could soon see transformative artificial intelligence (TAI). Models of competition for TAI show firms face strong competitive pressure to deploy TAI systems before they are safe. This paper explores a proposed solution to this problem, a Windfall Clause, where developers commit to donating a significant portion of any eventual extremely large profits to good causes. However, a key challenge for a Windfall Clause is that firms must have reason to join one. Firms must also believe these commitments are credible. We extend a model of TAI competition with a Windfall Clause to show how firms and policymakers can design a Windfall Clause which overcomes these challenges. Encouragingly, firms benefit from joining a Windfall Clause under a wide range of scenarios. We also find that firms join the Windfall Clause more often when the competition is more dangerous. Even when firms learn each other's capabilities, firms rarely wish to withdraw their support for the Windfall Clause. These three findings strengthen the case for using a Windfall Clause to promote the safe development of TAI.

【59】 A Multi-Task Learning Framework for COVID-19 Monitoring and Prediction of PPE Demand in Community Health Centres 标题:用于社区卫生中心个人防护用品需求监测和预测的多任务学习框架 链接:https://arxiv.org/abs/2108.09402

作者:Bonaventure Chidube Molokwu,Shaon Bhatta Shuvo,Ziad Kobti,Anne Snowdon 机构:School of Computer Science, University of Windsor, Windsor - Ontario, Canada, SCAN in Health 备注:6-page article/manuscript 摘要:目前,全世界都在寻求合适的缓解技术来控制和预防新型SARS-CoV-2的传播。在本文中,我们提出了一个独特的多任务学习框架,该框架共同预测SARS-CoV-2的影响以及特定人群在社区卫生中心的个人防护设备消费。通过研究和分析预测病毒(SARS-CoV-2)的影响,使我们能够了解SARS-CoV-2的性质以及促进其生长和传播的因素。因此,这些措施有助于提高广泛的认识;民众可以变得更加主动和谨慎,以缓解2019年冠状病毒病(COVID-19)的传播。此外,了解和预测个人防护设备的需求可以提高社区卫生中心医护人员的效率和安全性。由于SARS-CoV-2的新性质和菌株,这方面的文献和研究相对较少。这些现有文献试图使用基于代理的模型、机器学习模型或数学模型来解决问题陈述。有鉴于此,我们在这里的工作通过将问题陈述建模为多任务学习问题来补充现有文献。我们的研究结果表明,政府行为和人为因素是影响SARS-CoV-2传播的最重要决定因素。 摘要:Currently, the world seeks to find appropriate mitigation techniques to control and prevent the spread of the new SARS-CoV-2. In our paper herein, we present a peculiar Multi-Task Learning framework that jointly predicts the effect of SARS-CoV-2 as well as Personal-Protective-Equipment consumption in Community Health Centres for a given populace. Predicting the effect of the virus (SARS-CoV-2), via studies and analyses, enables us to understand the nature of SARS-CoV-2 with reference to factors that promote its growth and spread. Therefore, these foster widespread awareness; and the populace can become more proactive and cautious so as to mitigate the spread of Corona Virus Disease 2019 (COVID-19). Furthermore, understanding and predicting the demand for Personal Protective Equipment promotes the efficiency and safety of healthcare workers in Community Health Centres. Owing to the novel nature and strains of SARS-CoV-2, relatively few literature and research exist in this regard. These existing literature have attempted to solve the problem statement(s) using either Agent-based Models, Machine Learning Models, or Mathematical Models. In view of this, our work herein adds to existing literature via modeling our problem statements as Multi-Task Learning problems. Results from our research indicate that government actions and human factors are the most significant determinants that influence the spread of SARS-CoV-2.

【60】 InBiodiv-O: An Ontology for Indian Biodiversity Knowledge Management Link: https://arxiv.org/abs/2108.09372

Authors: Archana Patel, Sarika Jain, Narayan C. Debnath, Vishal Lama
Affiliations: Department of Software Engineering, School of Computing and Information Technology, Eastern International University, Vietnam; Department of Computer Applications, National Institute of Technology Kurukshetra, Haryana, India
Abstract: To present biodiversity information, a semantic model is required that connects all kinds of data about living creatures and their habitats. The model must be able to encode human knowledge so that machines can understand it. Ontologies offer the richest machine-interpretable (rather than just machine-processable) and explicit semantics, and they are extensively used in the biodiversity domain. Various ontologies have been developed for this domain; however, a review of the current landscape shows that they are not capable of describing Indian biodiversity information, even though India is one of the megadiverse countries. To semantically analyze Indian biodiversity information, it is crucial to build an ontology that describes all the essential terms of this domain from the unstructured data available on the web. Since the curation of an ontology heavily depends on the domain in which it is implemented, no ideal methodology has yet been defined for universal use. The aim of this article is to develop an ontology that semantically encodes all the terms of Indian biodiversity information in all its dimensions, based on the proposed methodology. A comprehensive evaluation shows that the ontology is well built for the specified domain.
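To make the idea of "semantically encoding domain terms" concrete, here is a minimal sketch (not the authors' actual ontology) of how biodiversity terms could be expressed as OWL classes and properties with rdflib. The namespace URI, class names, and the example individual are illustrative assumptions.

```python
# Minimal OWL sketch with rdflib: classes, an object property, one individual.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

BIO = Namespace("http://example.org/inbiodiv#")  # hypothetical namespace
g = Graph()
g.bind("bio", BIO)

# Classes: a taxon and its habitat.
for cls in (BIO.Taxon, BIO.Habitat):
    g.add((cls, RDF.type, OWL.Class))

# An object property linking organisms to where they live.
g.add((BIO.livesIn, RDF.type, OWL.ObjectProperty))
g.add((BIO.livesIn, RDFS.domain, BIO.Taxon))
g.add((BIO.livesIn, RDFS.range, BIO.Habitat))

# An individual that might be extracted from unstructured web text.
g.add((BIO.BengalTiger, RDF.type, BIO.Taxon))
g.add((BIO.BengalTiger, RDFS.label, Literal("Panthera tigris tigris")))
g.add((BIO.BengalTiger, BIO.livesIn, BIO.Sundarbans))

print(g.serialize(format="turtle"))
```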

【61】 One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles Link: https://arxiv.org/abs/2108.09355

Authors: Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-Rong Wen
Affiliations: Gaoling School of Artificial Intelligence, Renmin University of China; School of Information, Renmin University of China; Beijing Key Laboratory of Big Data Management and Analysis Methods; Key Laboratory of Data Engineering and Knowledge Engineering, MOE
Note: Accepted by SIGIR 2021 (full paper)
Abstract: Personalized chatbots focus on endowing chatbots with a consistent personality to behave like real users, give more informative responses, and further act as personal assistants. Existing personalized approaches have tried to incorporate several text descriptions as explicit user profiles. However, the acquisition of such explicit profiles is expensive and time-consuming, thus being impractical for large-scale real-world applications. Moreover, the restricted predefined profile neglects the language behavior of a real user and cannot be automatically updated together with the change of user interests. In this paper, we propose to learn implicit user profiles automatically from large-scale user dialogue history for building personalized chatbots. Specifically, leveraging the strength of Transformers in language understanding, we train a personalized language model to construct a general user profile from the user's historical responses. To highlight the historical responses relevant to the input post, we further establish a key-value memory network of historical post-response pairs and build a dynamic post-aware user profile. The dynamic profile mainly describes what and how the user has responded to similar posts in history. To explicitly utilize users' frequently used words, we design a personalized decoder that fuses two decoding strategies: generating a word from the generic vocabulary and copying a word from the user's personalized vocabulary. Experiments on two real-world datasets show significant improvement of our model over existing methods.
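The key-value memory lookup at the heart of the dynamic profile can be sketched as attention over historical post/response pairs. The following is a minimal sketch assuming pre-computed embeddings; the dimensions and the scaled dot-product similarity are assumptions, not the paper's exact formulation.

```python
# Minimal key-value memory sketch: keys are historical posts, values are
# the corresponding responses; the output is a post-aware profile vector.
import torch
import torch.nn.functional as F

def post_aware_profile(query, keys, values):
    """query: (d,) current-post embedding; keys, values: (n, d) history pairs."""
    scores = keys @ query                               # (n,) relevance per history post
    weights = F.softmax(scores / keys.shape[1] ** 0.5, dim=0)
    return weights @ values                             # weighted sum of responses

d, n = 128, 50
query = torch.randn(d)
keys, values = torch.randn(n, d), torch.randn(n, d)
profile = post_aware_profile(query, keys, values)
print(profile.shape)  # torch.Size([128])
```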

【62】 Data-driven Smart Ponzi Scheme Detection Link: https://arxiv.org/abs/2108.09305

Authors: Yuzhi Liang, Weijing Wu, Kai Lei, Feiyang Wang
Affiliation: Peking University
Abstract: A smart Ponzi scheme is a new form of economic crime that uses Ethereum smart contract accounts and cryptocurrency to implement a Ponzi scheme. Smart Ponzi schemes have harmed the interests of many investors, but research on their detection is still very limited. Existing detection methods require substantial manual effort for feature engineering and have poor model portability. To solve these problems, we propose a data-driven smart Ponzi scheme detection system. The system uses dynamic graph embedding to automatically learn the representation of an account based on multi-source and multi-modal data related to the account's transactions. Compared with traditional methods, the proposed system requires very limited human-computer interaction. To the best of our knowledge, this is the first work to implement smart Ponzi scheme detection through dynamic graph embedding. Experimental results show that this method significantly outperforms existing smart Ponzi scheme detection methods.
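The abstract does not detail the embedding model, so the following is only one plausible way to realize "dynamic graph embedding" for account classification: compute simple structural features per transaction-graph snapshot and summarize the sequence with a GRU. The feature set, snapshot granularity, and classifier are illustrative assumptions.

```python
# Sketch: per-snapshot structural features of an account, GRU over time.
import networkx as nx
import torch
import torch.nn as nn

def snapshot_features(g: nx.DiGraph, node) -> list:
    # Simple structural features of the account node in one snapshot.
    return [g.in_degree(node), g.out_degree(node),
            nx.clustering(g.to_undirected(), node)]

class AccountClassifier(nn.Module):
    def __init__(self, n_feats: int = 3, hidden: int = 32):
        super().__init__()
        self.gru = nn.GRU(n_feats, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # Ponzi vs. benign

    def forward(self, seq):               # seq: (batch, time, n_feats)
        _, h = self.gru(seq)
        return self.head(h[-1])

# Dummy usage: 10 daily snapshots of one account's transaction graph.
snapshots = [nx.gnp_random_graph(20, 0.1, directed=True) for _ in range(10)]
feats = torch.tensor([[snapshot_features(g, 0) for g in snapshots]],
                     dtype=torch.float)
logits = AccountClassifier()(feats)
print(logits.shape)  # torch.Size([1, 2])
```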

【63】 ECG-Based Heart Arrhythmia Diagnosis Through Attentional Convolutional Neural Networks Link: https://arxiv.org/abs/2108.10226

Authors: Ziyu Liu, Xiang Zhang
Affiliations: University of New South Wales, Sydney, Australia; Harvard Medical School, Harvard University, Boston, USA
Note: 7 pages, under review for an international conference
Abstract: The electrocardiography (ECG) signal is a widely used measurement of an individual's heart condition, and much effort has been devoted to automatic arrhythmia diagnosis based on machine learning. However, traditional machine learning models require a large investment of time and effort for raw data preprocessing and feature extraction, and often suffer from poor classification performance. Here, we propose a novel deep learning model, named Attention-Based Convolutional Neural Network (ABCNN), which takes advantage of CNNs and multi-head attention to work directly on raw ECG signals and automatically extract the informative dependencies needed for accurate arrhythmia detection. To evaluate the proposed approach, we conduct extensive experiments on a benchmark ECG dataset. Our main task is to distinguish arrhythmia from normal heartbeats and, at the same time, accurately recognize heart diseases across five arrhythmia types. We also provide a convergence analysis of ABCNN and intuitively show the meaningfulness of the extracted representations through visualization. The experimental results show that ABCNN outperforms widely used baselines, bringing us one step closer to an intelligent heart disease diagnosis system.
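A CNN front end followed by multi-head self-attention, as described above, can be sketched as follows. Layer sizes, the five-class output, and the mean-pooling readout are assumptions for illustration, not the published ABCNN configuration.

```python
# Sketch: Conv1d feature extractor + multi-head self-attention + classifier.
import torch
import torch.nn as nn

class AttentionCNN(nn.Module):
    def __init__(self, n_classes: int = 5, d_model: int = 64):
        super().__init__()
        # Convolutions extract local beat morphology from the raw signal.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, d_model, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        # Multi-head attention captures dependencies between distant beats.
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                  # x: (batch, 1, samples) raw ECG
        h = self.conv(x).transpose(1, 2)   # (batch, time, d_model)
        h, _ = self.attn(h, h, h)          # self-attention over the sequence
        return self.head(h.mean(dim=1))    # mean-pool, then classify

logits = AttentionCNN()(torch.randn(8, 1, 360))  # e.g., one-second windows
print(logits.shape)  # torch.Size([8, 5])
```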

【64】 Cooperative Localization Utilizing Reinforcement Learning for 5G Networks Link: https://arxiv.org/abs/2108.10222

Authors: Ghazaleh Kia, Laura Ruotsalainen
Affiliation: Department of Computer Science, University of Helsinki, Helsinki, Finland
Note: 2 pages, 1 figure; presented as a poster at the Second 6G Wireless Summit 2020
Abstract: The demand for accurate localization has risen in recent years to enable the emergence of autonomous vehicles. For these vehicles to join the traffic ecosystem of smart cities, an accurate positioning system is essential, and collaborative localization plays an important role in achieving it. This type of localization computes range measurements between vehicles and improves position accuracy by correcting the possibly faulty values of one vehicle using the more accurate values of another. 5G signals with Millimeter Wave (mmWave) technology support precise range measurements, and 5G networks provide Device-to-Device (D2D) communication, which improves collaborative localization. The aim of this paper is to provide accurate, less error-prone collaborative positioning for autonomous vehicles by utilizing a reinforcement learning technique to select the most accurate and suitable range-measurement technique for the 5G signal.
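The selection problem described above can be framed, in its simplest form, as a bandit: the agent repeatedly picks a range-measurement technique and is rewarded by (negative) ranging error. The candidate techniques, reward definition, and error model below are illustrative assumptions, not the paper's setup.

```python
# Epsilon-greedy bandit sketch for choosing a range-measurement technique.
import random

techniques = ["ToA", "TDoA", "RSS"]          # hypothetical candidates
q = {t: 0.0 for t in techniques}             # estimated value per technique
counts = {t: 0 for t in techniques}
eps = 0.1

def ranging_error(t: str) -> float:
    # Stand-in for a real measurement campaign; smaller is better.
    noise = {"ToA": 0.1, "TDoA": 0.3, "RSS": 1.0}[t]
    return abs(random.gauss(0.0, noise))

for step in range(1000):
    t = random.choice(techniques) if random.random() < eps else max(q, key=q.get)
    reward = -ranging_error(t)               # reward = negative ranging error
    counts[t] += 1
    q[t] += (reward - q[t]) / counts[t]      # incremental mean update

print(max(q, key=q.get))                     # converges to the lowest-error choice
```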

【65】 Anomaly Detection Based on Generalized Gaussian Distribution approach for Ultra-Wideband (UWB) Indoor Positioning System Link: https://arxiv.org/abs/2108.10210

Authors: Fuhu Che, Qasim Zeeshan Ahmed, Faheem A. Khan, Pavlos I. Lazaridis
Affiliation: University of Huddersfield, Huddersfield, UK
Abstract: With the rapid development of the Internet of Things (IoT), Indoor Positioning Systems (IPS) have attracted significant interest in academic research. Ultra-Wideband (UWB) is an emerging technology that can be employed for IPS as it offers centimetre-level accuracy. However, UWB systems still face several technical challenges in practice, one of which is Non-Line-of-Sight (NLoS) signal propagation. Several machine learning approaches have been applied to NLoS component identification. However, when the data contain only a very small proportion of NLoS components, it becomes very difficult for existing algorithms to classify them. This paper focuses on employing an anomaly detection approach based on Gaussian Distribution (GD) and Generalized Gaussian Distribution (GGD) algorithms to detect and identify the NLoS components. The simulation results indicate that the proposed approach provides robust NLoS component identification, improves NLoS signal classification accuracy, and thus yields a significant improvement in UWB positioning accuracy.
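The GGD-based anomaly detection idea can be sketched as: fit a generalized Gaussian to a feature of LoS-only training signals, then flag low-likelihood samples as NLoS. The one-dimensional feature and the 1st-percentile threshold below are assumptions for illustration.

```python
# GGD anomaly-detection sketch using scipy's generalized normal distribution.
import numpy as np
from scipy.stats import gennorm

rng = np.random.default_rng(0)
# Hypothetical training feature (e.g., received-signal rise time) from LoS data.
los_train = rng.normal(10.0, 1.0, 2000)

beta, loc, scale = gennorm.fit(los_train)          # GGD shape / location / scale
threshold = np.percentile(gennorm.logpdf(los_train, beta, loc, scale), 1)

def is_nlos(x: np.ndarray) -> np.ndarray:
    # Anomalous (NLoS) if the log-likelihood under the LoS model is very low.
    return gennorm.logpdf(x, beta, loc, scale) < threshold

test = np.array([10.2, 9.7, 16.5])                 # last value: NLoS-like outlier
print(is_nlos(test))                               # [False False  True]
```

Setting beta = 2 recovers the plain Gaussian (GD) detector, which is why the GGD variant can be seen as a strict generalization of it.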

【66】 Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies Link: https://arxiv.org/abs/2108.10152

Authors: Sicheng Zhao, Guoli Jia, Jufeng Yang, Guiguang Ding, Kurt Keutzer
Note: Accepted by IEEE Signal Processing Magazine (SPM)
Abstract: Humans are emotional creatures. Multiple modalities are often involved when we express emotions, whether we do so explicitly (e.g., facial expression, speech) or implicitly (e.g., text, image). Enabling machines to have emotional intelligence, i.e., to recognize, interpret, process, and simulate emotions, is becoming increasingly important. In this tutorial, we discuss several key aspects of multi-modal emotion recognition (MER). We begin with a brief introduction to widely used emotion representation models and affective modalities. We then summarize existing emotion annotation strategies and corresponding computational tasks, followed by a description of the main challenges in MER. Furthermore, we present representative approaches to representation learning for each affective modality, feature fusion of different affective modalities, classifier optimization for MER, and domain adaptation for MER. Finally, we outline several real-world applications and discuss some future directions.
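Among the fusion strategies such a tutorial covers, the simplest is late (decision-level) fusion. The sketch below combines two modality-specific classifiers by a weighted average of their class probabilities; the weights, class set, and two-modality setup are illustrative assumptions.

```python
# Late-fusion sketch: weighted average of per-modality class probabilities.
import numpy as np

def late_fusion(prob_face: np.ndarray, prob_speech: np.ndarray,
                w_face: float = 0.6) -> np.ndarray:
    """Combine per-modality emotion-class probabilities into one prediction."""
    fused = w_face * prob_face + (1.0 - w_face) * prob_speech
    return fused / fused.sum()

# Dummy outputs over four emotion classes: happy, sad, angry, neutral.
p_face = np.array([0.70, 0.10, 0.10, 0.10])
p_speech = np.array([0.40, 0.30, 0.20, 0.10])
print(late_fusion(p_face, p_speech))   # fused distribution, argmax = "happy"
```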

【67】 APObind: A Dataset of Ligand Unbound Protein Conformations for Machine Learning Applications in De Novo Drug Design Link: https://arxiv.org/abs/2108.09926

Authors: Rishal Aggarwal, Akash Gupta, U Deva Priyakumar
Note: The 2021 ICML Workshop on Computational Biology
Abstract: Protein-ligand complex structures have been utilised to design and benchmark machine learning methods that perform important tasks related to drug design, such as receptor binding-site detection, small-molecule docking, and binding-affinity prediction. However, these methods are usually trained only on ligand-bound (holo) conformations of the protein and therefore are not guaranteed to perform well when the protein structure is in its native unbound (apo) conformation, which is usually the conformation available for a newly identified receptor. A primary reason for this is that the local structure of the binding site usually changes upon ligand binding. To facilitate solutions to this problem, we propose a dataset called APObind that provides apo conformations of proteins present in the PDBbind dataset, a popular dataset used in drug design. Furthermore, we explore the performance of methods specific to three use cases on this dataset, through which the importance of validating such methods on APObind is demonstrated.

【68】 Electroencephalogram Signal Processing with Independent Component Analysis and Cognitive Stress Classification using Convolutional Neural Networks Link: https://arxiv.org/abs/2108.09817

Authors: Venkatakrishnan Sutharsan, Alagappan Swaminathan, Saisrinivasan Ramachandran, Madan Kumar Lakshmanan, Balaji Mahadevan
Affiliations: Dept. of Electrical and Electronics Engineering, SSN College of Engineering, India; CSIR - Central Electronics Engineering Research Institute, Pilani, Rajasthan, India
Note: 16 pages, 10 figures, 2 tables, 8 equations, 16 references
Abstract: The electroencephalogram (EEG) is a recording of the brain's bio-electrical activity acquired from electrodes placed on the scalp. In EEG recordings, the acquired signals are contaminated predominantly by the electrooculogram (EOG) signal. Since this artifact has a higher magnitude than the EEG signal, these noise signals have to be removed in order to better understand the functioning of the human brain in applications such as medical diagnosis. This paper proposes using Independent Component Analysis (ICA) along with cross-correlation to denoise the EEG signal. This is done by selecting components based on their cross-correlation coefficient against a threshold and reducing their effect instead of zeroing them out completely, thereby reducing information loss. Results on recorded data show that this algorithm can eliminate the EOG artifact with little loss of EEG data. The denoising is verified by an increase in SNR and a decrease in the cross-correlation coefficient. The denoised signals are used to train an Artificial Neural Network (ANN) that examines the features of the input EEG signal and predicts the stress level of the individual.
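The ICA-plus-cross-correlation step can be sketched with scikit-learn's FastICA: decompose the channels, attenuate (rather than zero) any component strongly correlated with an EOG reference, and reconstruct. The synthetic signals, the 0.7 correlation threshold, and the 0.1 attenuation factor are illustrative assumptions, and the zero-lag correlation coefficient stands in for a full cross-correlation.

```python
# ICA denoising sketch: attenuate components correlated with an EOG reference.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 2000)
eog = np.sign(np.sin(0.5 * np.pi * t))             # blink-like EOG reference
eeg = np.sin(8 * 2 * np.pi * t) + 0.1 * rng.normal(size=t.size)
X = np.c_[eeg + 2.0 * eog, eeg - 1.5 * eog]        # two contaminated channels

ica = FastICA(n_components=2, random_state=0)
S = ica.fit_transform(X)                           # independent components

for k in range(S.shape[1]):
    # Correlation of each component with the EOG reference at zero lag.
    r = np.corrcoef(S[:, k], eog)[0, 1]
    if abs(r) > 0.7:
        S[:, k] *= 0.1   # attenuate, not zero out, to limit information loss

X_clean = ica.inverse_transform(S)                 # reconstructed EEG channels
print(X_clean.shape)                               # (2000, 2)
```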

【69】 A generalized forecasting solution to enable future insights of COVID-19 at sub-national level resolutions Link: https://arxiv.org/abs/2108.09556

Authors: Umar Marikkar, Harshana Weligampola, Rumali Perera, Jameel Hassan, Suren Sritharan, Gihan Jayatilaka, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Janaka Ekanayake, Anuruddhika Rathnayake, Samath Dharmaratne
Affiliations: University of Peradeniya, Sri Lanka; Sri Lanka Technological Campus, Padukka, Sri Lanka; Postgraduate Institute of Medicine, University of Colombo, Sri Lanka
Abstract: COVID-19 continues to have a significant impact on public health. To minimize this impact, policymakers undertake containment measures that, when carried out disproportionately to the actual threat as a result of erroneous threat assessment, cause undesirable long-term socio-economic complications. In addition, macro-level or national-level decision making fails to consider localized sensitivities in small regions. Hence the need arises for region-wise threat assessments that provide insights into the behaviour of COVID-19 through time, enabled by accurate forecasts. In this study, a forecasting solution is proposed to predict daily new cases of COVID-19 in regions small enough that containment measures could be implemented locally, by targeting three main shortcomings in the literature: the unreliability of existing data caused by inconsistent testing patterns in smaller regions, the weak deployability of forecasting models for predicting cases in previously unseen regions, and model training biases caused by the imbalanced nature of data in COVID-19 epi-curves. The contributions of this study are therefore threefold: an optimized smoothing technique that smooths less deterministic epi-curves based on the epidemiological dynamics of the region, a Long Short-Term Memory (LSTM) based forecasting model trained using data from selected regions to create a representative and diverse training set that maximizes deployability in regions lacking historical data, and an adaptive loss function used during training to mitigate the data imbalance seen in epi-curves. The proposed smoothing technique, the generalized training strategy, and the adaptive loss function greatly increase the overall accuracy of the forecast, enabling efficient containment measures at a more localized micro-level.
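An LSTM forecaster with an imbalance-aware loss, in the spirit of the adaptive loss described above, can be sketched as follows. The exact weighting scheme here (weights growing with the target magnitude, so rare high-incidence days dominate less rarely) is an illustrative assumption, not the paper's formulation.

```python
# LSTM forecasting sketch with a target-weighted MSE loss.
import torch
import torch.nn as nn

class CaseForecaster(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, window, 1) past daily cases
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # next-day new-case prediction

def imbalance_weighted_mse(pred, target):
    # Rare large-target samples receive proportionally larger weights.
    w = 1.0 + target / (target.mean() + 1e-8)
    return (w * (pred - target) ** 2).mean()

model = CaseForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(16, 14, 1)                  # 14-day windows from several regions
y = torch.rand(16, 1)
loss = imbalance_weighted_mse(model(x), y)
loss.backward(); opt.step()
```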

【70】 Temporally Nonstationary Component Analysis; Application to Noninvasive Fetal Electrocardiogram Extraction Link: https://arxiv.org/abs/2108.09353

Authors: Fahimeh Jamshidian-Tehrani, Reza Sameni, Christian Jutten
Affiliations: School of Electrical & Computer Engineering, Shiraz University; Université Grenoble Alpes
Note: 9 pages, 5 figures
Abstract: Objective: Mixtures of temporally nonstationary signals are very common in biomedical applications. The nonstationarity of the source signals can be used as a discriminative property for signal separation. Herein, a semi-blind source separation algorithm is proposed for the extraction of temporally nonstationary components from linear multichannel mixtures of signals and noise. Methods: A hypothesis test is proposed for the detection and fusion of temporally nonstationary events, using ad hoc indexes to monitor the first- and second-order statistics of the innovation process. As proof of concept, the general framework is customized and tested on noninvasive fetal cardiac recordings acquired from the maternal abdomen, over publicly available datasets, using two types of nonstationarity detectors: 1) a local power variation detector, and 2) a model-deviation detector using the innovation-process properties of an extended Kalman filter. Results: The performance of the proposed method is assessed in the presence of white and colored noise, at different signal-to-noise ratios. Conclusion and Significance: The proposed scheme is general and can be used for the extraction of nonstationary events and sample deviations from a presumed model in multivariate data, which is a recurrent problem in many machine learning applications.
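The first detector mentioned above, a local power variation test, can be sketched as a chi-square hypothesis test on windowed variance against a baseline. The window length, the significance level, and the use of the first window as the stationary baseline are assumptions for illustration.

```python
# Local power variation detector sketch: chi-square test on windowed variance.
import numpy as np
from scipy.stats import chi2

def local_power_detector(x: np.ndarray, win: int = 200, alpha: float = 0.01):
    """Return start indices of windows whose power is anomalously high or low."""
    sigma2 = np.var(x[:win])                       # baseline (stationary) power
    lo = chi2.ppf(alpha / 2, df=win - 1) * sigma2 / (win - 1)
    hi = chi2.ppf(1 - alpha / 2, df=win - 1) * sigma2 / (win - 1)
    flags = []
    for i in range(0, len(x) - win + 1, win):
        v = np.var(x[i:i + win])
        if v < lo or v > hi:                       # reject H0: constant power
            flags.append(i)
    return flags

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 5000)
x[3000:3400] += rng.normal(0, 3, 400)              # transient nonstationary event
print(local_power_detector(x))                     # flags windows near sample 3000
```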
