Deep Learning and Reinforcement Learning

Hierarchical control architecture for humanoid model

Koji Ishihara, Jun Morimoto


Although robotics technologies have rapidly improved and are widely used in industry and daily life, the capability of humanoid robots remains far inferior to that of human beings. Humanoid robots have been promoted as substitutes for human workers. However, it remains difficult for them to generate human-like multi-joint movements. This problem can be attributed to the dynamics model used in the conventional control framework: a humanoid robot is modeled as a significantly reduced system, such as an inverted pendulum, and motions are generated based on this reduced model. Since such a low-dimensional model explains only part of the robot's dynamics, it significantly restricts the movements that can be generated. To cope with this problem, we propose a hierarchical control framework with two levels: a high-level controller based on optimal control and a low-level controller based on a reflex-based, neuro-inspired controller. Unlike the conventional framework, the high-level controller plans whole-body motions of the humanoid robot under a full-body dynamics model. The planned motions are executed by the low-level controller, which prevents the robot from fatal falls using the reflex mechanism. Since planning whole-body motion takes a relatively long time, the high-level controller cannot rapidly react to unknown effects such as disturbances. The reflex mechanism improves the balancing capability of the humanoid robot.
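The two-rate structure described above can be illustrated with a minimal sketch (all gains, rates, and the first-order plant are illustrative assumptions, not the authors' controller): a slow planner updates the reference, while a fast loop tracks it and lets a reflex term dominate when a balance signal crosses a threshold.

```python
import math

def high_level_plan(t):
    """Slow planner: a placeholder trajectory standing in for the
    optimal-control plan computed under a full-body dynamics model."""
    return 0.5 * math.sin(0.5 * t)

def low_level_control(q, q_ref, tilt, kp=5.0, reflex_gain=20.0, tilt_limit=0.3):
    """Fast loop: track the high-level plan, but add a strong reflex
    correction when the measured tilt threatens balance."""
    u = kp * (q_ref - q)            # track the planned motion
    if abs(tilt) > tilt_limit:      # reflex: react between plan updates
        u -= reflex_gain * tilt
    return u

# two-rate loop: re-plan every 100 ms, control every 1 ms
dt, q, tilt = 0.001, 0.0, 0.0
q_ref = high_level_plan(0.0)
for i in range(1000):
    if i % 100 == 0:                # slow re-planning
        q_ref = high_level_plan(i * dt)
    u = low_level_control(q, q_ref, tilt)
    q += dt * u                     # toy first-order plant
```

The point of the sketch is only the separation of time scales: the reflex term can fire within one fast-loop step, long before the next plan update arrives.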


Riemannian geometry in machine learning and data analysis

Minh Ha Quang


Traditional machine learning and data analysis methods often assume that the input data can be represented by vectors in Euclidean space. While this assumption has worked well for many applications, researchers have increasingly realized that if the data is intrinsically non-Euclidean, ignoring this geometrical structure can lead to suboptimal results. Thus many geometrical methods have been proposed in Machine Learning, Statistics and applications. Our work focuses on Riemannian Geometrical Methods for Covariance Matrices and Covariance Operators, which have found numerous applications in Brain Imaging, Brain Computer Interfaces, and Computer Vision, to name just a few domains. In particular, our geometrical framework for Covariance Operators exploits both Riemannian Geometry and Kernel Methods and represents some of the latest developments in this research direction, both mathematically and algorithmically, with promising practical performance. The theoretical formalism will be illustrated by numerical examples.
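One concrete instance of such a Riemannian method is measuring distances between covariance matrices on the manifold of symmetric positive definite (SPD) matrices rather than in flat Euclidean space. The sketch below uses the standard Log-Euclidean distance (one of several Riemannian metrics in this literature; the choice here is illustrative, not the specific framework of this work):

```python
import numpy as np

def log_euclidean_distance(A, B):
    """Log-Euclidean distance between two SPD (covariance) matrices:
    d(A, B) = ||logm(A) - logm(B)||_F, via eigendecomposition."""
    def spd_log(M):
        w, V = np.linalg.eigh(M)       # eigenvalues of an SPD matrix are > 0
        return V @ np.diag(np.log(w)) @ V.T
    return np.linalg.norm(spd_log(A) - spd_log(B), 'fro')

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.eye(2)
d = log_euclidean_distance(A, B)
```

Unlike the Euclidean (Frobenius) distance, this metric respects the curved geometry of the SPD cone, which is what makes it useful for covariance descriptors in brain imaging and vision.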


Parallel deep reinforcement learning with model-free and model-based methods

Eiji Uchibe


Reinforcement learning can be categorized into model-based methods that exploit an (estimated) environmental model, and model-free methods that directly learn a policy through interaction with the environment. To improve learning efficiency, we have proposed CRAIL, which dynamically selects a learning module from multiple heterogeneous modules according to learning performance while all modules are trained simultaneously. However, CRAIL does not consider model-based methods. This study extends CRAIL to deal with both model-based and model-free methods and investigates whether dynamic switching between them contributes to improved learning efficiency. The proposed method was evaluated on MuJoCo benchmark tasks. Experimental results show that a model-based method with a simple model was selected at the early stage of learning, and a model-based method with a complicated model was used at the later stage. Furthermore, model-free methods were selected when the network did not have sufficient capacity to represent the environmental dynamics.
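The module-selection idea can be sketched as performance-weighted stochastic choice among heterogeneous learners (a hypothetical softmax rule with an assumed temperature; CRAIL's actual selection mechanism may differ):

```python
import math
import random

random.seed(0)

def select_module(recent_returns, beta=2.0):
    """Stochastically pick a learning module, favouring those with
    higher recent performance. beta is an assumed temperature."""
    weights = [math.exp(beta * r) for r in recent_returns]
    total = sum(weights)
    probs = [w / total for w in weights]
    x, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if x < cum:
            return i, probs
    return len(probs) - 1, probs

# e.g. three modules: simple model-based, complex model-based, model-free
idx, probs = select_module([0.9, 0.2, 0.4])
```

All modules keep training on the shared experience; only the module that acts is switched, which is what allows the simple model to dominate early and richer models to take over later.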


Deep reinforcement learning approach to gliding locomotion

Tatsuya Nagase, Katsuhiko Inagaki, Tomoko Ozeki


In this paper, we apply deep reinforcement learning to the gliding locomotion of a leg-wheel robot. Leg-wheel robots, which combine the structures of legs and wheels, have been studied as a way to move robots efficiently and stably. Movement by wheels is highly energy-efficient and fast on flat terrain, while walking with legs is effective on uneven terrain where it is difficult to drive with wheels. We have built a robot with three passive wheels on its body and two legs. The robot has two RC servomotors controlled by PWM signals: one for opening and closing the legs, and the other for changing the direction of the wheels. The robot has a rotary encoder to observe the speed of the body and a Raspberry Pi for learning and for sending command signals to the hardware. In previous studies, the robot was controlled using periodic signals such as sinusoidal waves. However, hand-coding must anticipate all possible control rules in the environment to accomplish the task, and it is difficult to write if-then rules covering every change in friction coefficient and road surface irregularity. A robot with reinforcement learning can achieve the task by trial and error in the environment, collecting data by itself. In addition, since behaviours suited to the environment can be acquired automatically, all we must do is define rewards, an indicator of the goodness of the actions taken by the robot during learning. Deep reinforcement learning can therefore be expected to reduce human effort and time compared to hand-coding. In this paper, we use Deep Q-Networks (DQN), where the state is given by the speed of the body and the angle of the servomotor, the action is the servomotor angle for the next step, and the reward is the speed of the body.
Starting from sinusoidal waves as an initial condition, the robot acquired appropriate gliding locomotion to proceed forward by DQN.
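The state/action/reward structure above can be sketched with tabular Q-learning over discretised states (a simplified stand-in for the DQN; the toy dynamics, in which alternating the servo angle builds up body speed, are purely illustrative):

```python
import random

random.seed(1)

# state = (discretised body speed, servo angle index),
# action = servo angle index for the next step, reward = body speed.
N_SPEED, N_ANGLE, ALPHA, GAMMA, EPS = 5, 8, 0.1, 0.9, 0.2
Q = {}

def q(s, a):
    return Q.get((s, a), 0.0)

def step(state, action):
    """Toy environment: changing the servo angle builds up speed."""
    speed, angle = state
    new_speed = min(N_SPEED - 1, speed + 1) if action != angle else max(0, speed - 1)
    return (new_speed, action), float(new_speed)   # reward = body speed

state = (0, 0)
for _ in range(2000):
    if random.random() < EPS:                      # epsilon-greedy exploration
        a = random.randrange(N_ANGLE)
    else:
        a = max(range(N_ANGLE), key=lambda x: q(state, x))
    nxt, r = step(state, a)
    best_next = max(q(nxt, x) for x in range(N_ANGLE))
    Q[(state, a)] = q(state, a) + ALPHA * (r + GAMMA * best_next - q(state, a))
    state = nxt
```

A DQN replaces the table `Q` with a neural network so the continuous encoder speed need not be discretised; the update rule is otherwise the same temporal-difference target.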


Text detection and recognition from natural scene using a portable USB device

Deepak Rai, Tomoko Ozeki


Recently, with the launch of high-spec, portable edge technologies, the Internet of Things (IoT) field has gained more freedom, shifting from cloud processing to edge computing. The launch of cheap and portable processing units such as the Google Coral USB Accelerator has laid the foundation for many standalone, cost-efficient, automated devices. These devices have helped grow applications in industry, health care, retail, smart spaces and transportation. In this project, we have built an automated and cost-efficient device for natural scene text detection and recognition using Google's Coral USB Accelerator. Text detection and recognition in natural scenes is considered one of the most difficult and valuable challenges in computer vision. Most existing methods treat text detection and recognition as two separate tasks. For detection, they generally use a CNN-based model with multiple stages and components; for recognition, sequential prediction is conducted one by one on top of the detected text regions. However, this degrades overall performance, which is determined by the interplay of all the stages and components in the pipeline, and it leads to heavy computation, especially for complex images with multiple text regions. In this work, we solve this problem with a highly simplified end-to-end trainable network. In the text detection branch, we use a modified version of the highly simplified FCN-based EAST architecture, which contains two sub-branches for text instance segmentation and instance-level bounding box regression. Further, since natural images contain many small text boxes, we upscale the feature maps from 1/32 to 1/8 of the original input image size. In the text recognition branch, text sequence information is encoded using a CNN and LSTM.
These technical improvements remove many intermediate stages from our pipeline, making it very simple. The simplified architecture reduces the loss of information incurred by passing through multiple steps. We have experimented with the ICDAR2015, COCO-Text, MLT and Total-Text datasets. Finally, we converted the 32-bit parameters of the TensorFlow model into 8-bit representations using quantization-aware training to deploy it on the Google Coral USB Accelerator. In the end, we integrated the Coral USB with a Raspberry Pi.
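The 32-bit to 8-bit conversion step can be sketched with simple per-tensor symmetric quantization (a minimal post-training sketch; the actual work used TensorFlow's quantization-aware training, which learns quantization-friendly weights during training rather than converting afterwards):

```python
import numpy as np

def quantize_int8(w):
    """Map float32 weights to int8 with a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

w = np.random.RandomState(0).randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()          # bounded by half a quantization step
```

Edge accelerators such as the Coral require this int8 form, which shrinks the model roughly 4x and enables fast integer arithmetic at a small, bounded accuracy cost.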


A new method to infer the structure of neural networks

Mohamed Boubakour, Timothée Leleu, Kazuyuki Aihara


One way to obtain experimental data about neural activity is to use a multi-electrode array (MEA). With an MEA, one can measure spiking activity with good precision in a sample of a few hundred neurons. From these data, neuroscientists try to infer the connectivity between neurons, usually assuming that the connectivity can be approximated with second-order correlations. But this method is very limited, since it neglects higher-order correlations and assumes that the connectivity between neurons is symmetric. We developed a new method that takes into account higher-order correlations and asymmetries in the interactions. We assume that the spiking activity can be modeled with common models like the leaky integrate-and-fire (LIF) neuron. We then look at the phase space of a neural network: the dynamics of the network can be represented as a mapping transformation of the phase space. In the end we obtain the final shape (stationary behavior) of the phase space, which can be divided into partitions. Each partition corresponds to a specific event, i.e., a pattern in which some neurons spike for some duration. By computing the volume of each partition, we can determine the probability of seeing particular neurons spike, and from that the mean number of spikes for each neuron. In simple cases, we showed that this mean number is directly related to the connectivities. This method can also be generalized to more sophisticated models like the Izhikevich model, and simulations showed that our method has better precision than the usual methods.
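For reference, the LIF model assumed above is simple to simulate: the membrane potential leaks toward zero, integrates its input, and emits a spike with a reset when it crosses threshold (parameter values here are illustrative):

```python
def simulate_lif(I, dt=1e-3, tau=0.02, v_th=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron: tau dv/dt = -v + I(t).
    Spike and reset when v crosses threshold; return spike indices."""
    v, spikes = 0.0, []
    for t, i_t in enumerate(I):
        v += dt / tau * (-v + i_t)     # forward-Euler leak + integration
        if v >= v_th:
            spikes.append(t)
            v = v_reset
    return spikes

spikes = simulate_lif([1.5] * 1000)    # constant supra-threshold drive
```

With constant supra-threshold input the neuron fires regularly; sub-threshold input (here anything below 1.0) produces no spikes, since the potential saturates below threshold.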


Self-supervised learning with energy-based spiking neural networks

Kotaro Sakamoto, Hiroya Masuoka


Energy-based generative neural networks learn probability distributions of training data in the form of energy-based models (EBMs), where the energy functions are parameterised by neural networks. Training such EBMs is known to be challenging. Most popular artificial neural networks use a simplified model of neurons. Spiking neural networks (SNNs) are a more biologically plausible framework with time-dependent spiking neurons; they, too, are not easy to train. In the present research, we hybridise EBMs and SNNs to boost learning efficiency and ability. In a simple set of numerical experiments, we show that EBMs parameterised by SNNs converge faster during learning and exhibit robust and diverse patterns of learned representations. In addition, we investigated their ability to model complex behaviours.


Learning memory-dependent continuous control from demonstrations

Siqing Hou, Dongqi Han, Jun Tani


Efficient exploration has presented a long-standing challenge in reinforcement learning, especially when rewards are sparse. Demonstrations can be used to guide exploration in these cases. However, existing methods are not applicable to most real-world robotic control problems because they assume that environments follow Markov decision processes (MDPs); thus, they do not extend to partially observable environments where historical observations are necessary for decision making. This paper builds on the idea of replaying demonstrations for memory-dependent continuous control by proposing a novel algorithm, Recurrent Actor-Critic with Demonstration and Experience Replay (READER). Experiments involving several memory-crucial continuous control tasks reveal significantly reduced interaction with the environment when using our method with a reasonably small number of demonstration samples. The algorithm also shows better sample efficiency and learning capability than a baseline reinforcement learning algorithm for memory-based control from demonstrations.
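The demonstration-and-experience-replay idea can be sketched as mixed mini-batch sampling from two buffers (the ratio and batch size are assumed hyperparameters, not the paper's values; for memory-dependent control the buffer entries would be whole observation-action sequences rather than single transitions):

```python
import random

random.seed(0)

def sample_batch(demo_buffer, exp_buffer, batch_size=8, demo_ratio=0.25):
    """Draw one training batch mixing demonstration transitions with
    self-collected experience, so sparse-reward tasks still see
    successful behaviour from the start of training."""
    n_demo = min(int(batch_size * demo_ratio), len(demo_buffer))
    batch = random.sample(demo_buffer, n_demo)
    batch += random.sample(exp_buffer, batch_size - n_demo)
    random.shuffle(batch)
    return batch

demos = [('demo', i) for i in range(20)]     # expert-provided sequences
exps = [('exp', i) for i in range(100)]      # agent's own experience
batch = sample_batch(demos, exps)
```

Keeping a fixed fraction of demonstration data in every batch is what lets the recurrent actor-critic bootstrap from the demonstrations without being restricted to them.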


Brain-like unsupervised feature learning for convolutional neural networks

Takashi Shinozaki


Since the advent of the convolutional neural network (CNN) [LeCun et al., 1989; Krizhevsky et al., 2012], brain-like information processing has become much more important in artificial intelligence. On the other hand, there is an ongoing debate about whether back-propagation (BP) learning, the main learning method for CNNs, is sufficiently brain-like. In fact, a DNN trained by BP requires many more layers than the brain, which is a problem when associating the dynamics of a CNN with brain responses. To solve this problem, we propose applying a brain-like learning method, competitive learning, to CNNs. Competitive learning is a traditional unsupervised learning method for neural networks, used in the Neocognitron [Fukushima, 1980] and the self-organizing map (SOM) [Kohonen, 1982], and it models the dynamics of the visual cortices of the brain. We applied competitive learning to a large-scale neural network and examined its effectiveness. We constructed a multi-layer feature extractor that realizes brain-like information processing by using a simple activation function, winner-takes-all (WTA), instead of the rectified linear unit (ReLU), which requires BP signals. In a discrimination experiment on the ImageNet dataset, we performed unsupervised feature learning with the proposed method and then applied error-based learning only to the final layer. As a result, the proposed method acquired a large number of features far more stably than BP learning, and achieved state-of-the-art accuracy for a biologically plausible neural network. This brain-like unsupervised feature learning makes it easier to compare brain activity with artificial neural networks, and we expect new methods incorporating even more brain-like mechanisms to be developed.
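A channel-wise winner-takes-all activation of the kind described can be sketched as follows (a minimal interpretation: at each spatial position, only the strongest channel survives; the actual layer design may differ):

```python
import numpy as np

def winner_takes_all(x):
    """Channel-wise WTA for a feature map of shape (channels, H, W):
    at each (h, w) position keep only the winning channel's value
    and zero the rest. Needs no backpropagated error signal."""
    winners = x.argmax(axis=0)                     # (H, W) winning channel
    mask = np.zeros_like(x)
    h, w = winners.shape
    mask[winners, np.arange(h)[:, None], np.arange(w)[None, :]] = 1.0
    return x * mask

x = np.random.RandomState(0).randn(4, 3, 3)
y = winner_takes_all(x)
```

Because the competition is purely local (an argmax over channels), the winning unit can be updated toward its input by a Hebbian-style rule, which is what makes the feature learning unsupervised.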


Maximizing transfer entropy in agent-based models

Ekaterina Sangati, Georgina Montserrat Reséndiz-Benhumea, Federico Sangati, Alexey Yudin and Tom Froese


Several measures have been proposed in recent neuroscientific and simulation work for capturing neural complexity. Some have focused on the density of information transfer between different parts of the system (Tononi et al., 1994), others on the amount of entropy produced by the system (Lynn et al., 2020). Both measures have been found to correlate with consciousness and with performing more cognitively demanding tasks in human studies. Both have also been used in simulation studies to drive guided self-organization of agents that efficiently solve a variety of tasks. In this work we compare the properties of the two measures by means of a simplified example using simulated two-dimensional data, as well as through an agent-based simulation of a social interaction task. The latter builds on previous work by Candadai et al. (2019), who showed that evolving agents to maximize the entropy production of their neural system leads to more complex neural activity when they are placed in the context of social interaction, but not when they are evolved in isolation. In this work we instead evolve the agents to maximize pairwise transfer entropy and examine the differences in the resulting behavioral and neural patterns compared to the original model.
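The quantity maximized here, pairwise transfer entropy, can be estimated for binary time series with a simple plug-in estimator (history length 1 for brevity; the simulation study may use longer histories and continuous states):

```python
import math
import random
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate of TE(X -> Y) for binary sequences, history 1:
    TE = sum p(y', y, x) * log2[ p(y'|y, x) / p(y'|y) ]."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))
    pairs_yx = Counter(zip(y[:-1], x[:-1]))
    pairs_yy = Counter(zip(y[1:], y[:-1]))
    singles_y = Counter(y[:-1])
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_cond_full = c / pairs_yx[(y0, x0)]
        p_cond_y = pairs_yy[(y1, y0)] / singles_y[y0]
        te += p_joint * math.log2(p_cond_full / p_cond_y)
    return te

# y copies x with one step of delay, so TE(x -> y) should be large
random.seed(0)
x = [random.randint(0, 1) for _ in range(500)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

The asymmetry of the measure (TE(x→y) large, TE(y→x) near zero here) is exactly what distinguishes directed information transfer from symmetric correlation.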

World Model Learning and Inference

Choose a good option or avoid a bad option: D2-MSN in the NAc selectively contributes to the strategy to avoid a bad option under decision

Tadaaki Nishioka, Takatoshi Hikida


To optimize decision making, animals need to execute not only a strategy to choose a good option but also one to avoid a bad option. Positive reinforcement learning promotes an action associated with a positive outcome, and negative reinforcement learning suppresses an action associated with a negative outcome; both types of learning are important for executing effective behavioral strategies. A number of studies using computational neural circuit models and animal experiments have proposed that dopamine signaling is implicated in reinforcement learning, but how the neural substrates targeted by dopamine contribute to reinforcement learning and to the execution of a strategy under decision has been poorly understood. In this study, we focused on the nucleus accumbens (NAc), which receives dopaminergic innervation from the ventral tegmental area (VTA), and tested how manipulating the two populations of NAc projection neurons, dopamine D1 receptor-expressing medium spiny neurons (D1-MSN) and D2-MSN, affects the execution of a strategy under decision. We expressed the inhibitory DREADD hM4Di in the D1-MSN or D2-MSN of the NAc and investigated the effect of suppressing the activity of these neurons on the performance of a visual discrimination task. To quantitatively assess the neural mechanisms underlying a strategy to choose a good option or one to avoid a bad option, we developed two novel visual discrimination tasks: Visual Discrimination-based Positive Reinforcement Learning (VD-PRL), in which animals need to acquire a strategy to choose a good option, and Visual Discrimination-based Negative Reinforcement Learning (VD-NRL), in which animals need to acquire a strategy to avoid a bad option. We found that D2-MSN in the NAc selectively contributes to the strategy to avoid a bad option under decision.


Elucidation of neural mechanism underlying model-based decision making using optogenetic manipulation

Yu Ohmura


Adapting to changing environmental conditions requires a prospective inference of future actions and their consequences, a strategy also known as model-based decision making. In stable environments, extensive experience of actions and their consequences leads to a shift from a model-based to a model-free strategy, whereby behavioral selection is primarily governed by retrospective experiences of positive and negative outcomes. Human and animal studies, where subjects are required to speculate about implicit information and adjust behavioral responses over multiple sessions, point to a role for the central serotonergic system in model-based decision making. However, to directly test a causal relationship between serotonergic activity and model-based decision making, phase-specific manipulations of serotonergic activity are needed in a one-shot test, where learning by trial and error is neutralized. Moreover, the serotonergic origin responsible for this effect is yet to be determined. Herein, we demonstrate that optogenetic silencing of serotonin neurons in the dorsal raphe nucleus, but not in the median raphe nucleus, disrupts model-based decision making in lithium-induced outcome devaluation tasks. Our findings provide insights into the neural mechanisms underlying neural weighting between model-free and model-based strategies.


Fast somatosensory feedback control is regulated by uncertainty in body state estimates, but not by target uncertainty

Sho Ito, Hiroaki Gomi


To execute precise movements, our brain constantly corrects ongoing motor commands depending on dynamically changing body states. Such feedback control is provided partly via reflexes as well as by voluntary control. As previous studies have shown, reflex responses are functionally modulated depending on goals and contexts [Kimura et al., 2006; Scott et al., 2015]. In particular, online flexible modulation of reflexive responses suggests that a state estimation process including multisensory integration underlies reflex control. Indeed, we have previously shown attenuation of the stretch reflex, an involuntary muscle contraction driven by proprioceptive input, under distortion or elimination of the visual cursor [Ito and Gomi, 2020], which suggests a contribution of visual feedback to the gain tuning of the stretch reflex. In the present study, to further clarify how visual information is processed to regulate the stretch reflex, we examined the effect of visual uncertainty on the stretch reflex by manipulating the visibility of the cursor and/or target. In the experiment, human participants were asked to flex their wrist from a start position to a visual target located randomly in each trial. While the visual target and a visual cursor representing the hand position were fully provided in baseline trials, the visual information was displayed only partially in two types of test trials. In Cursor-off trials, the visual cursor was eliminated immediately after the hand movement started, to introduce uncertainty in the hand state. In Target-off trials, the visual target was present only for the first 100 ms and then disappeared until the hand movement stopped, to introduce uncertainty in the target location. We found a significant increase in the variability of movement endpoints in both Cursor-off and Target-off trials, suggesting that uncertainty in both hand and target state degraded movement precision. In addition, we evaluated the intensity of the reflexive feedback response.
In randomly selected trials, a mechanical perturbation was applied to the wrist in the middle of the hand movement to evoke the stretch reflex of the wrist flexor muscle. As in previous research, the amplitude of the long-latency stretch reflex significantly decreased in Cursor-off trials compared to baseline trials. In contrast, the amplitude of the stretch reflex did not differ in Target-off trials. These results suggest that the attenuation of the stretch reflex was caused by uncertain estimates of the hand state but not by ambiguity of target information. Our findings support the hypothesis that the reflex is regulated according to the uncertainty of the body state estimate, which is constantly updated using visual feedback.


Evaluating double articulation analyzer with human magnetoencephalography

Takeshi Kishiyama, Yohei Oseki, Tadahiro Taniguchi


Natural language processing (a branch of artificial intelligence) and the neurobiology of language (a branch of brain science) have traditionally been divorced. In natural language processing, on the one hand, computational bases of language have been developed with the rise of deep learning techniques, but the question of how those computational bases are biologically realized in the human brain has not been sufficiently addressed. In the neurobiology of language, on the other, neural bases of language have been revealed thanks to neuroimaging techniques, but the perspective of how those neural bases are algorithmically implemented with neural computations has been largely neglected. However, despite being proposed relatively independently, those computational and neural bases show a striking resemblance in that both constitute complex networks of various modules, so that a happy marriage of the two fields is highly desirable. For this purpose, in this paper, we investigate the computational and neural bases of language by constructing neurocomputational models based on symbolic automata and evaluating them with neurophysiological measurements of human magnetoencephalography (MEG). Specifically, (i) the nonparametric Bayesian double articulation analyzer (NPB-DAA; Taniguchi et al., 2016) was trained in an unsupervised manner on the Corpus of Spontaneous Japanese (CSJ; Maekawa, 2003), (ii) MEG responses to naturalistic speech were recorded continuously with a 306-channel Vectorview whole-head MEG system (Elekta Ltd., Helsinki, Finland) at the National Rehabilitation Center for Persons with Disabilities (NRCD), and finally (iii) the trained NPB-DAA was tested statistically against the recorded MEG responses via information-theoretic complexity metrics (Shannon, 1948) and spatiotemporal cluster permutation regression (Maris & Oostenveld, 2007).
The results of this neurocomputational modeling will be presented, and their implications for artificial intelligence and brain science of language will be discussed (Oseki & Taniguchi, in preparation).
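The information-theoretic complexity metrics used in step (iii) are of the surprisal family: each unit's unexpectedness is -log2 of its model probability. A minimal sketch with a unigram model follows (the actual metrics come from the trained NPB-DAA's probabilities; the smoothing and toy data here are illustrative):

```python
import math
from collections import Counter

def surprisal_sequence(train, test):
    """Per-unit surprisal -log2 P(unit) under a unigram model estimated
    from training data, with add-one smoothing over the vocabulary."""
    counts = Counter(train)
    total = sum(counts.values())
    vocab = len(counts)
    def p(u):
        return (counts[u] + 1) / (total + vocab)
    return [-math.log2(p(u)) for u in test]

train = list("aabababbacca")          # toy unit sequence
s = surprisal_sequence(train, list("abc"))
```

These per-unit surprisal values are the regressors that are then tested against the continuous MEG responses: rarer (higher-surprisal) units are predicted to evoke larger responses.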


Effect of null-space variability on skill learning in a multidimensional motor task: analysis with hierarchical Bayesian modeling

Lucas Rebelo Dal’Bello, Jun Izawa


While the variability of motor commands has typically been seen as mere noise that should be reduced, recent evidence suggests that it contributes to exploration for finding the best motor commands during motor learning. However, even though the human motor apparatus is highly redundant, this contribution of motor variability has been examined only for the task-relevant space, and thus little is known about the effect of null-space variability on motor learning. Moreover, these effects might differ depending on which learning algorithm, error-based learning or reinforcement learning, is employed by the subject in the task. Here, we developed hierarchical Bayesian models of motor memory update with separate contributions of error-based learning and reinforcement learning, and fitted these models to data acquired from subjects during a multidimensional motor task. Thirteen subjects wore a ten degrees-of-freedom data glove and were instructed to use hand gestures to control the direction of a cursor on a screen. The redundancy of the task was implemented as two perpendicular control spaces, a task-potent space (mapped to the screen) and a null space (hidden from subjects), composed of perpendicular hand gestures, such that the same direction on the screen could be reached using multiple gestures. We measured single-trial adaptation (in both control spaces) to instantaneous, pseudorandom visuomotor rotations of the displayed cursor. We found that adding a reinforcement learning component to the fitted models significantly improves their fit compared with models with error-based learning alone, and that the best-fitting model combined both error-based and reinforcement learning contributions.
Although simulations of both learning algorithms indicated that variability in the task-potent space should correlate more with the learning rates of both components, we found that the learning rates instead correlated more with the amount of variability in the null space. This indicates that, while the task-potent space is exploited and its variability reduced, the null space is actively explored, and such exploration strongly influences the learning rate in the task-potent space. These results suggest that null-space variability could be a good indicator of learning rate; manipulating such variability could therefore lead to better control of the learning rate, which could be useful in the context of motor rehabilitation.
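The two learning components can be sketched in a toy single-state trial-by-trial model (the learning rates, noise level, and reward rule below are illustrative assumptions, not the fitted hierarchical Bayesian model): the error-based term corrects in proportion to the signed visual error, while the reinforcement term reinforces the previous trial's motor noise when reward improves.

```python
import random

random.seed(2)

def simulate_learner(rotations, eta_err=0.2, eta_rl=0.1, sigma=0.05):
    """Toy motor-adaptation model combining error-based and
    reinforcement learning updates across trials."""
    x, prev_reward, prev_noise, xs = 0.0, -1.0, 0.0, []
    for rot in rotations:
        noise = random.gauss(0.0, sigma)     # exploratory motor variability
        y = x + noise                        # produced reach direction
        error = rot + y                      # cursor error under rotation
        reward = -abs(error)
        x -= eta_err * error                 # error-based update
        if reward > prev_reward:             # reinforcement update:
            x += eta_rl * prev_noise         # reinforce last trial's noise
        prev_reward, prev_noise = reward, noise
        xs.append(x)
    return xs

xs = simulate_learner([0.3] * 200)           # constant 0.3 rad rotation
```

Under a constant rotation the state converges to roughly the opposite of the rotation, cancelling the error; in the actual study both components are fitted jointly and their rates compared with measured variability in each control space.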


An open-world simulator for the developmental robotics

SM Mazharul Islam, Aishwarya Pothula, Md Ashaduzzaman Rubel Mondol, Deokgun Park


Despite recent progress in application-specific models, we have yet to witness models with human-like intelligence. To tackle this challenge, self-supervised learning is emerging, where conventional norms such as highly curated domain-specific data with labeled ground truth, or application-specific environments with extrinsic rewards, might not provide suitable ground. A simulated environment where a learning agent can go through experiences similar to those of a human infant would promote self-supervised learning and the generation of a hierarchical representation of the surrounding environment, while reducing bias towards learning explicitly defined tasks. We introduce SEDRo, a simulated environment that allows a learning agent to have experiences similar to those a human infant goes through from the fetal stage until 12 months of age. The sensory and motor capabilities of the learning agent are progressively enhanced as it reaches successive developmental stages. For example, during the fetal stage, the agent only receives tactile and proprioceptive stimuli. Similarly, for the first three months after birth, the agent is nearsighted. Initially, the motor outputs are insufficient to support self-movement (e.g., sitting up, crawling, and walking), but a steady increase in strength makes these movements possible at later stages. Interactable objects such as toys respond to the agent's behavior using the physics simulator provided by the Unity 3D game engine. To simulate social learning, we have designed a mother/caregiver character modeled after real infant-caregiver interactions. The caregiver character can initiate actions or respond to the agent's behaviors by randomly choosing from an action library. As the agent cannot yet follow verbal instructions, established psychological experiments for the evaluation of human babies will be used as benchmarks.
One such experiment is the visual expectation paradigm, in which a baby attends to different aspects of the visual scene as it grows. Recorded responses from the agent will be compared with responses from human infants at the same developmental stage as a sign of intelligence development. We hope that researchers in the AI community can use SEDRo to develop and test models for human-like intelligence.


Goal-directed planning for habituated agents by active inference using a variational recurrent neural network

Takazumi Matsumoto, Jun Tani


It is crucial to ask how agents can achieve goals by generating action plans using only partial models of the world acquired through habituated sensory-motor experiences. Although many existing robotics studies use a forward model framework, there are generalization issues with high degrees of freedom. This work shows that the predictive coding (PC) and active inference (AIF) frameworks, which employ a generative model, can develop better generalization by learning a prior distribution in a low dimensional latent state space representing probabilistic structures extracted from well habituated sensory-motor trajectories. In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights for maximizing the evidence lower bound, while goal-directed planning is accomplished by inferring latent variables for maximizing the estimated lower bound. Our proposed model was evaluated with both simple and complex robotic tasks in simulation, which demonstrated sufficient generalization in learning with limited training data by setting an intermediate value for a regularization coefficient. Furthermore, comparative simulation results show that the proposed model outperforms a conventional forward model in goal-directed planning, due to the learned prior confining the search of motor plans within the range of habituated trajectories.


Encoding and decoding of auditory cortex during perceptual decision making

Akihiro Funamizu, Fred Marbach, Anthony M Zador


Sound responses in auditory cortex are modulated by many task variables, including sound statistics, task engagement, and spectral attention. To drive behavior, these modulated neural representations must be decoded by the downstream areas to which auditory cortex projects. Here we investigate two questions about the decoding of auditory cortical neurons. First, we test whether noise in the cortical representation of auditory stimuli limits the performance of an animal performing an auditory discrimination task. Second, we test how downstream brain areas can decode neural representations in the auditory cortex if those representations are themselves changed by task variables. To address these questions, we developed a two-alternative choice auditory task for head-fixed mice in which we varied either reward expectation (by varying the amount of reward, in blocks of 200 trials) or stimulus expectation (by varying the probability of different stimuli). The task was based on our previous study (Marbach and Zador, bioRxiv, 2016), in which mice selected the left or right spout depending on the frequency of tone stimuli (low or high). We used two-photon calcium imaging to record populations of neurons in auditory cortex while mice performed the task. We found that varying either reward or stimulus expectation changed neural representations (i.e., stimulus encoding). Sound decoding from the activity of one or a small number of auditory cortical neurons matched the behavioral performance of mice on a trial-by-trial basis, suggesting that cortical noise did not limit the performance of the animal during this task. The sound stimuli encoded by auditory cortex could be reliably and stably decoded by downstream areas, even when the encoding was modulated by behavioral context. By contrast, neither context nor choice could be reliably decoded from the activity, implying that the animal's decisions depend on the integration of information represented outside of auditory cortex.
Our results suggest that stimuli encoded by auditory cortex can be reliably read out by downstream areas, even when the encoding is modulated by task-relevant contingencies.


Can features extracted by machine learning be quantified in terms of physics?

Shotaro Shiba Funai


Machine learning can learn various data by extracting features of the data and transforming them into distributed representations. This process can be interpreted as a kind of data compression, and some researchers have suggested, somewhat naively, that it should be related to coarse-graining. In statistical physics, coarse-graining is formulated as renormalization. In particular, the iterative process of coarse-graining is described as the renormalization flow, and its behavior can be characterized by phase transitions, such as the water/ice or ferromagnetic/paramagnetic transitions. In this poster, we concentrate on unsupervised learning (an autoencoder with a restricted Boltzmann machine or a convolutional neural network) so that any dependence on arbitrary labels for the training data can be avoided. As training data, we use images of simple systems in physics, such as the Ising model and the Gaussian matrix model, for which various quantities can be calculated analytically. The configurations of these systems can be described as black-and-white or gray-scale images, so we use them for machine learning of image recognition. We then discuss the correspondence between the physical quantities and the distributed representations or reconstructed data in unsupervised learning. The correspondence naturally depends on the hyperparameters of our neural networks; however, we can also find general properties of how the features extracted in our unsupervised learning are related to renormalization and thermodynamic properties in statistical physics. This study would provide some guidelines for choosing hyperparameter values and, moreover, for understanding why machine learning works well.
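To make the analogy concrete, coarse-graining of an Ising configuration can be sketched with a block-spin (majority-vote) transformation. This is a generic illustration of the renormalization step discussed above, not the authors' actual analysis code, and the function name `block_spin` is our own:

```python
import numpy as np

def block_spin(config, b=2):
    """One coarse-graining (block-spin) step: each b x b block of +/-1
    spins is replaced by the sign of its sum (ties broken toward +1),
    shrinking an n x n configuration to (n/b) x (n/b)."""
    n = config.shape[0]
    blocks = config.reshape(n // b, b, n // b, b).sum(axis=(1, 3))
    return np.where(blocks >= 0, 1, -1)

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(8, 8))   # a black-and-white "image"
coarse = block_spin(spins)                 # 4 x 4 coarse-grained image
```

Iterating `block_spin` traces the renormalization flow; comparing configurations at successive steps with a network's hidden representations is the kind of correspondence the abstract describes.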


Investigation of agential interactions based on frameworks of predictive coding and active inference: A simulation study with pseudo-imitative-interaction between human and robot

Wataru Ohata, Jun Tani


In social interaction, agents sometimes cooperate by sharing their intentions to gain mutual benefit, and at other times they conflict by each following its own intention. Although it is not obvious how such dynamics in social interaction emerge, investigating the dynamic characteristics of agency in social interaction settings might shed light on the underlying mechanism. In particular, the sense of agency (SoA) refers to the congruence between the agent's intention in action and the outcome. To explore the SoA in social interaction from the perspective of a computational model, predictive coding (PC) and active inference (AIF) provide a theoretical framework for the agent's perception and action generation. Focusing on a mathematical property of a model based on PC and AIF, we hypothesized that regulation of the complexity of the agent's model should affect the strength of agency in social interaction. To evaluate this hypothesis, we built a computational model of an agent with proprioceptive and visual sensation using a variational Bayes recurrent neural network. We tested the model in the form of pseudo-imitative interaction between a robot and a human, in which the robot interacts with recorded human body-movement data. A vital feature of the model is that the complexity of each modality, as well as of the entire network, can be regulated independently by changing the values of a hyperparameter of the network. We investigated how the values of the hyperparameter affect the strength of agency during the interaction. We found that with weaker regulation of complexity, the robot tends to move more egocentrically, without adapting to the human counterpart. On the other hand, with stronger regulation, the robot tends to follow its human counterpart by adapting its internal state. Our study concludes that the strength with which complexity is regulated significantly affects the nature of dynamic interactions between individual agents in a social setting.
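The per-modality complexity regulation described here can be written in the standard variational free-energy form. The following is a schematic reconstruction, not the paper's exact equation; the per-modality weights w^prop and w^vis are our own notation:

```latex
\mathcal{F} \;=\; \underbrace{\mathbb{E}_{q(z)}\!\left[-\log p\!\left(x^{\mathrm{prop}}, x^{\mathrm{vis}} \mid z\right)\right]}_{\text{accuracy}}
\;+\; w^{\mathrm{prop}}\, D_{\mathrm{KL}}\!\left[q\!\left(z^{\mathrm{prop}}\right) \,\middle\|\, p\!\left(z^{\mathrm{prop}}\right)\right]
\;+\; w^{\mathrm{vis}}\, D_{\mathrm{KL}}\!\left[q\!\left(z^{\mathrm{vis}}\right) \,\middle\|\, p\!\left(z^{\mathrm{vis}}\right)\right]
```

Larger weights penalize divergence of the posterior from the prior more strongly (stronger complexity regulation), which corresponds in the abstract to the robot adapting to, rather than ignoring, its human counterpart.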

Metacognition and Metalearning

Vision-based speedometer regulates implicit adjustments of walking speed

Shinya Takamuku, Hiroaki Gomi


When we drive a car or walk on a sidewalk, the translational motion of our body can be inferred from vision. A series of studies has revealed how the brain estimates the direction of self-motion from optic flow. Meanwhile, it remains unclear whether and how it computes the translational velocity. Since retinal motion velocity depends both on the self-motion velocity and on the depth structure of the scene, a critical issue is whether the brain is capable of disambiguating the two factors. In this regard, behavioural studies have produced mixed results. Furthermore, no study to our knowledge has revealed a distance-invariant, vision-based speed coding in the primate brain. Here, to address the issue, we examined a highly optimized human visuomotor response that automatically estimates and maintains walking speed in its most energy-efficient range. Participants wore a binocular head-mounted display and walked inside a virtual corridor whose side walls occasionally moved either forward or backward. The change in walking velocity caused by the wall motion was evaluated as the response. We found, repeatedly, that the response was invariant to the distance of the side walls even though the retinal-motion velocity differed dramatically among conditions. The invariance was not explained by temporal-frequency-based coding, previously suggested to underlie the constancy of speed perception. Meanwhile, it broke down when the interocular distance was virtually manipulated such that the near walls appeared far and vice versa. It also broke down when the structure of the corridor was built such that monocular depth cues became deceptive in a similar manner. Our findings reveal a speedometer in the human brain that computes self-motion velocity from vision on the fly by integrating depth and motion cues.


Optimization of visuomotor learning speed by reinforcement meta-learning

Taisei Sugiyama, Nicolas Schweighofer, Jun Izawa


Both humans and animals show acceleration of learning in various tasks and environments, yet the underlying mechanism is unknown. In machine learning, reinforcement learning of learning speed has been developed as an automatic tuning of learning parameters to maximize total reward over multiple sessions. Learning to learn from rewards, i.e., reinforcement meta-learning, has also been proposed for the brain, but without empirical examination. Here, we implemented a reinforcement meta-learning problem in a visuomotor learning task in which the speed of motor learning influenced subsequent scores that were associated with monetary compensation. In the task, participants first observed a sensory prediction error during an arm-shooting movement in one trial, and the amount of change in movement direction (i.e., learning from the error) was evaluated with a score in subsequent trials. Thus, the task included a Markov decision process in which an observation of error (state) and the speed of learning (policy) determined the amount of learning (action) and thereby a score (reward), and optimizing the policy meant changing the speed of learning to maximize the score. We tested two conditions with opposite optimal policies by manipulating the relationship between learning and score. Specifically, the participants could earn a large score or avoid a big loss by faster learning in one condition and by slower learning in the other. We demonstrated that motor learning was accelerated over time when fast learning yielded a better score compared to when slower learning yielded a better score (p = .0002), and the effect appeared as robust as a previously reported bias in which loss of score led to faster learning than gain (p = .0009). Also, the observed change in learning speed generalized transiently to a typical visuomotor learning task without score (p = .007), further supporting that the participants changed the speed of motor learning.
Thus, participants regulated the speed of motor learning according to the optimal policies in the task, suggesting that they optimized learning speed by reinforcement meta-learning. Because research evidence suggests involvement of the basal ganglia in reinforcement learning and of the cerebellum in sensory error-based motor learning, we suggest that interactions between the two brain regions play a key role in optimizing the parameters of motor learning through reinforcement learning.
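The meta-learning loop can be sketched as a toy simulation: an error-based learner with learning rate eta, nested inside a reward-driven search over eta. This is an illustrative stand-in (simple hill-climbing rather than a full policy-gradient method), and `run_session` with its quadratic score is our own simplification, not the experimental task:

```python
import numpy as np

rng = np.random.default_rng(1)

def run_session(eta, n_trials=50, perturbation=0.3):
    """One block of trials: the motor command adapts to a visuomotor
    perturbation with learning rate eta; the score rewards small errors."""
    x, total_score = 0.0, 0.0
    for _ in range(n_trials):
        error = perturbation - x          # sensory prediction error
        x += eta * error                  # error-based motor learning
        total_score += -error ** 2        # score: penalize squared error
    return total_score

# Meta-level: adjust the learning rate itself to maximize total score.
eta = 0.05
for _ in range(100):
    candidate = np.clip(eta + rng.normal(0, 0.02), 0.01, 1.0)
    if run_session(candidate) > run_session(eta):
        eta = candidate
```

In this toy environment faster learning always earns a higher score, so the meta-level drives eta upward, mirroring the condition in which fast learning yielded better scores; reversing the score-learning relationship would drive eta down.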


Adaptive continuous timescale recurrent neural networks for cognitive modelling

Stefan Heinrich, Shingo Murata, Yuichi Yamashita, Yukie Nagai


Information processing in the human brain, which underlies many cognitive functions, is governed by both spatial and temporal mechanisms. Our understanding of the temporal mechanisms, however, is particularly scarce because of the limits in inspecting the brain's complex activity at high spatial and temporal resolution. Nevertheless, intriguing hypotheses have been suggested about the involvement of oscillations in neural populations, local integration by mode coupling, inherent temporal hierarchies, and different intrinsic timescales in the brain's activity. Cognitive modelling can be used as a method to verify these hypotheses, to predict and interpolate effects, and to explain individual differences in information processing. In particular, artificial neural network models that capture different timescale characteristics from synthetic as well as behavioural data allow for a detailed inspection of representation formation and of the emergence of signifying activation patterns. In our contribution to the symposium, we present cognitive models with timescale parameters that adapt to the temporal characteristics of sensory input and show properties of hierarchical compositionality and modulation. Our models are based on the biologically plausible Continuous Timescale Recurrent Neural Network architecture and can learn the individual neurons' leakage of information, and thus a complex interplay of the neurons' activation on multiple timescales. We show how these models can explain the brain's adaptiveness and structure extraction in analysable synthetic tasks as well as in cognitive prediction tasks. We also demonstrate how these models can be utilised to explain individual differences, such as between children and adults, or between typically developed people and people with psychiatric symptoms including autism spectrum conditions or schizophrenia.
We want to discuss how our models can help us better understand these conditions, and perhaps also allow for building training tools that can alleviate the effects of these psychiatric conditions in the long run.
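The timescale mechanism at the core of such models is the leaky-integrator unit of the Continuous Timescale RNN, in which each neuron's time constant tau sets how much past activation it retains. A minimal Euler-discretized sketch follows (tau is fixed here for illustration; in the adaptive models described above, tau itself would be learned):

```python
import numpy as np

rng = np.random.default_rng(0)

def ctrnn_step(u, x, W, W_in, tau, dt=1.0):
    """One Euler step of a continuous-timescale RNN: each unit leaks its
    internal state u at a rate set by its own timescale tau, so large-tau
    units integrate slowly while small-tau units track fast input changes."""
    y = np.tanh(u)
    du = (-u + W @ y + W_in @ x) / tau
    return u + dt * du

n, m = 10, 3
W = rng.normal(0, 0.3, (n, n))        # recurrent weights
W_in = rng.normal(0, 0.3, (n, m))     # input weights
# A temporal hierarchy: fast units (tau = 2) and slow units (tau = 50).
tau = np.array([2.0] * 5 + [50.0] * 5)

u = np.zeros(n)
for _ in range(20):
    u = ctrnn_step(u, rng.normal(size=m), W, W_in, tau)
```

Units with large tau change slowly and integrate long input histories, while small-tau units track fast changes, producing the kind of temporal hierarchy the abstract refers to.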


Learning what to communicate between agents for monitoring smart homes

Masoumeh Heidari Kapourchali, Bonny Banerjee


Recognition of the daily activities of individuals living in a home is of interest in many applications, such as health monitoring, home automation, and surveillance. A recognition system can inform a caregiver or the police if abnormal activities are detected. We propose a smart home monitoring agent that can actively sample other agents via communication, where each agent monitors its own home with unique residents, layout, and sensor types. Since each agent's observations are unique, the knowledge base regarding activities in smart homes is distributed among these agents. An agent's purpose in communicating is efficient knowledge acquisition by sampling this distributed knowledge base. We propose an agent model in the predictive coding framework (Friston et al., 2012; Kapourchali & Banerjee, 2020) that can learn a communication policy for what to communicate in an online manner, without any supervision or reinforcement. A communication policy allows knowledge transfer using a common vocabulary of words involving the functionality of sensors, their location, and the time of day when triggered. The policy allows an agent to select the words that may be communicated in each situation, which leads to efficient knowledge transfer. Our model is evaluated on two publicly available benchmark datasets, collected from a house and an apartment (Van Kasteren et al., 2010). The house has six rooms on two floors with 21 sensors. The apartment has two rooms with 23 sensors. The residents of the house and the apartment are a 57-year-old and a 28-year-old male, respectively. Each home is assumed to be monitored by an independent agent. One agent is trained to recognize activities in the house by observing its sensor data. This knowledge is transferred to the other agent monitoring the apartment. Our experiments show that the agents can transfer knowledge by communicating the most informative messages. The messages are interpretable.
The accuracy and F-score of our agent in inferring the activities are 77.52% and 83%, respectively, which are higher than those of traditional transfer learning models for the same task.


Multiscale computation and dynamic attention in biological and artificial intelligence

Ryan Paul Badman, Thomas Trenholm Hills, Rei Akaishi


Biological and artificial intelligence (AI) are often defined by their capacity to achieve a hierarchy of short-term and long-term goals that require incorporating information over time and space at both local and global scales. More advanced forms of this capacity involve the adaptive modulation of integration across scales, which resolves computational inefficiency and explore-exploit dilemmas at the same time. Research in neuroscience and in AI has made progress towards understanding architectures that achieve this. Insight into biological computations comes from phenomena such as decision inertia, habit formation, information search, risky choice, and foraging. Across these domains, the brain is equipped with mechanisms (such as the dorsal anterior cingulate and dorsolateral prefrontal cortex) that can represent and modulate across scales, both through top-down control processes and through local-to-global consolidation as information progresses from sensory to prefrontal areas. Paralleling these biological architectures, progress in AI is marked by innovations in dynamic multiscale modulation, moving from recurrent and convolutional neural networks, with their fixed scalings, to attention, Transformers, dynamic convolutions, and consciousness priors, which modulate scale according to the input and increase scale breadth. The use and development of these multiscale innovations in robotic agents, game AI, and natural language processing (NLP) are pushing the boundaries of AI achievements. By juxtaposing biological and artificial intelligence, the present work underscores the critical importance of multiscale processing to general intelligence, as well as highlighting innovations in, and differences between, the futures of biological and artificial intelligence.


Homeostatic recognition circuits emulating network-wide bursting and surprise

Tsvi Achler


Understanding the circuits of recognition is essential for building a deeper understanding of virtually all of the brain's behaviors and circuits. The goal of this work is to capture simultaneous findings at both the neural and behavioral levels, namely network-wide bursting with surprise (unexpected inputs) dynamics, using a hypothesized recognition circuit based on the idea of homeostatic flow. If a real brain at rest is presented with an unexpected or new stimulus, the network shows a fast, network-wide increase in activation (bursting of many neurons) followed by a slower inhibition, until the network settles again to a resting state. Bursting phenomena during recognition are found ubiquitously in virtually every type of organism, in isolated brain preparations, and even in neural tissue grown in a dish, yet their source and function remain poorly understood. Behavioral manifestations of surprise can be observed when the input is highly unexpected and may involve multiple brain regions. The homeostatic flow model posits that activation from inputs is balanced by top-down pre-synaptic regulatory feedback from output neurons. Information is projected from inputs to outputs with forward connections, then back to inputs with backward homeostatic connections that inhibit the inputs. This effectively acts to balance the inputs and outputs (homeostasis) and generates an internal, error-dependent input. This homeostatic input is then projected again to the outputs and back, until the output values reflect recognition. No weights are learned during recognition. When a surprising or unexpected input stimulus is presented, network-wide bursting occurs because the homeostatic balance is disturbed by the new stimulus. The system subsequently calms down as it settles into a new homeostasis.
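One way to read the described loop is as divisive feedback iterated to a fixed point. The following sketch is our interpretation of the mechanism; the specific update rule and normalization are illustrative choices, not the author's exact equations:

```python
import numpy as np

def recognize(x, W, n_iter=50, eps=1e-9):
    """Iterative recognition without weight learning: outputs send
    top-down feedback that inhibits (here, divides) the inputs; the
    residual input is projected forward again until inputs and
    feedback balance (homeostasis)."""
    n_out = W.shape[0]
    y = np.ones(n_out) / n_out            # start at rest
    for _ in range(n_iter):
        feedback = W.T @ y + eps          # top-down expectation of inputs
        residual = x / feedback           # divisive, error-dependent input
        y = y * (W @ residual) / (W.sum(axis=1) + eps)  # re-project forward
    return y

# Two stored patterns as fixed forward weights (nothing learned here).
W = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
y = recognize(np.array([1.0, 1.0, 0.0]), W)
```

At the fixed point the feedback W.T @ y reproduces the input, i.e., homeostasis; a pattern the weights cannot explain leaves a large residual on the first iterations, the model's analogue of transient network-wide bursting before settling.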


Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks

Dongqi Han, Kenji Doya, Jun Tani


Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., in solving memory-dependent tasks and in meta-learning. However, little effort has been spent on improving RNN architectures or on understanding the underlying neural mechanisms for performance gains. In this paper, we propose a novel multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals, compared with starting from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics.


Beneficial roles for chaotic variability in learning systems

Matthew Farrell, Stefano Recanatesi, Timothy Moore, Guillaume Lajoie, Eric Shea-Brown


Neural responses are highly variable, even under identical task conditions. Significant efforts are being directed toward explaining how the brain copes with and may even leverage such variability to help learn the task and environment. Here we explore the issue in a recurrent neural network model that is trained to classify inputs. We find two potential beneficial roles for chaotic variability in these systems: (1) chaos can accelerate the flexible relearning of a task after it is modified; and (2) chaos can lift the network representation of data into a higher-dimensional space, which allows the network to classify inputs embedded in low-dimensional spaces.
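The chaotic regime invoked here is the classic one for random rate networks, where chaos appears above a critical coupling gain. A minimal sketch of the standard diagnostic, sensitivity to initial conditions, under that generic model (not the authors' trained network):

```python
import numpy as np

rng = np.random.default_rng(0)
n, dt, steps = 100, 0.1, 500
J = rng.normal(0, 1 / np.sqrt(n), (n, n))   # random recurrent weights

def simulate(g, u0):
    """Euler-integrate du/dt = -u + g * J @ tanh(u). With this weight
    scaling, g < 1 decays to a fixed point while g > 1 yields
    self-sustained, chaotic activity (for large n)."""
    u = u0.copy()
    for _ in range(steps):
        u = u + dt * (-u + g * J @ np.tanh(u))
    return u

u0 = rng.normal(size=n)
twin = u0 + 1e-6 * rng.normal(size=n)       # tiny perturbation

# Chaos: nearby trajectories separate; subcritical: both decay together.
gap_chaotic = np.linalg.norm(simulate(1.5, u0) - simulate(1.5, twin))
gap_stable = np.linalg.norm(simulate(0.5, u0) - simulate(0.5, twin))
```

Below the critical gain the two trajectories collapse onto the same fixed point; above it they stay separated, illustrating the intrinsic variability that the abstract proposes can accelerate relearning and lift representations into higher-dimensional spaces.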


How morphological growth can participate in the acquisition of locomotion behavior in robots

Fabien C. Y. Benureau, Jun Tani


Development and learning are cornerstone processes of intelligent life. While they operate on different timescales, they interact with one another in non-trivial ways. We show one example of such an interaction between morphological growth and the acquisition of locomotion behavior in simulated soft robots. In particular, we show that when a robot grows up, from a small birth size to a larger adult size, with a gradual increase in motor strength and mass, it tends to acquire a better walking behavior than a robot that only learns as an adult. Since most robots today have a fixed morphology and, when they learn, learn only in their adult form, this result may have a broad impact on how future robots are designed and how their learning processes can be improved. This result also adds to the evidence for the importance of morphological development in animals and humans, and provides an experimental framework for testing further hypotheses about this phenomenon.


Whole brain reference architecture to evaluate biological plausibility of human-like artificial intelligence

Hiroshi Yamakawa (The Whole Brain Architecture Initiative / The University of Tokyo / RIKEN BDR), Naoya Arakawa (The Whole Brain Architecture Initiative), Koichi Takahashi (RIKEN BDR)


In recent years, suitably combined systems of machine learning modules have been able to solve a variety of intelligent tasks. There is therefore growing interest in designing architectures for achieving more general intelligence. If the desired AGI is limited to the human level, the design space can be constrained by using neuroscience knowledge. In order to develop such a brain-inspired AGI, techniques are needed for evaluating the plausibility of the implemented software against the neuroscientific facts (e.g., anatomical and physiological findings) of the whole brain. To cross the gap from neuroscientific facts and evaluate the biological validity of machine learning implementations, a bridge is needed in the form of hypotheses about computational function (functional hypotheses) grounded in neuroscientific facts. We have therefore started to build the Whole Brain Reference Architecture (WBRA) as a knowledge base for describing functional hypotheses across the whole brain. The WBRA is written in a standard format called Brain Information Flow (BIF). In the BIF format, we adopted information flow as a structural basis that can be shared between biological neural circuits and artificial neural networks. Various subcircuits of the brain are described as nodes, and the starting points of connections between subcircuits are described using, as basic units, axon sets projected from neurons of the same subtype. In constructing the WBRA, the structure of information flow is determined first, and the functional hypothesis is then assigned to it. Although data such as the connectome are growing, they have not yet reached a level sufficient to describe the BIF. We are therefore currently constructing the structure by interpreting the descriptions in the relevant papers.
Furthermore, the construction of the functional hypothesis proceeds in such a way that the functional part-whole relationship corresponds to the part-whole relationship of the information flow structure. At present, the researcher assigns a functional hypothesis to a structure with fewer than a dozen basic units under consideration, by interpreting findings in the literature. In the future, we would like to complete a reasonably comprehensive WBRA and build software that is a prototype of a brain-inspired AGI, able to solve a certain variety of tasks with reference to the WBRA. We would then like to explore the architecture of human-level AGI by modifying the software in various ways within the constraints given by the WBRA. However, there are difficulties in automatically assessing the biological validity of software from functional hypotheses written in natural language, and overcoming these difficulties is one of the key issues for future architecture-search technology.


Construction of a whole brain reference architecture (WBRA)

Mei Sasaki, Naoya Arakawa, Hiroshi Yamakawa


In this poster, we report on recent progress in the construction of the Whole Brain Reference Architecture (WBRA), a database for registering hypotheses about brain function together with experimentally obtained anatomical and physiological findings, in the Brain-Information-Flow (BIF) format. We plan to make the WBRA public in the near future. The WBRA differs from existing databases in that it is not for registering microscopic structures of the brain; rather, it is a database in which contributors can freely register functional hypotheses about the brain, which are otherwise written in unstructured form in individual papers and rarely managed in a unified format. Possible applications of the WBRA include serving as a reference source for the development of artificial general intelligence (AGI), evaluation of biological validity, and computational psychiatry. Roughly speaking, the current ontology for the BIF format consists of the following classes and properties. From an anatomical viewpoint, there is a "Circuit" class to represent circuits in the brain and a "Connection" class to represent connections between circuits. The minimum possible unit in BIF is the "Uniform circuit" class, which represents a group of neurons of the same subtype. The starting point of a "Connection" is required to be a "Uniform circuit"; a "Connection" therefore corresponds to an axon set projected from a neuron group of the same subtype. There is a "transmitter" property to represent the output nerve cells and a "modType" property to represent excitatory or inhibitory character based on physiological knowledge. In addition, there is a "functionality" property for expressing a Circuit's functional hypothesis or a Connection's semantic hypothesis, which is a distinctive feature of this database, and a "reference" property for registering a reference paper for each hypothesis. Currently, the "functionality" property stores the functions of a specific brain component as natural language text.
There is also a property for storing OBO IDs to reference existing databases. We plan to make these data public and also to provide a visualization tool for demonstrating parts of the brain and their functionalities. In the presentation, we plan to share our current procedure for constructing the WBRA, fundamental difficulties in the project, some statistics about the data in the WBRA, and the estimated coverage of the data.

AI for Neuroscience and Neuromorphic Technologies

Understanding cognitive impairment in neurodevelopmental disorders as an imbalance between stochastic and deterministic dynamics in neural networks

Takafumi Soda, Ahmadreza Ahmadi, Jun Tani, Mikio Hoshino, Manabu Honda, Takashi Hanakawa, Yuichi Yamashita


According to the predictive coding framework, altered estimation of uncertainty is considered a key concept for understanding the pathological mechanisms of neurodevelopmental disorders. However, the relationship between the development of dynamical neural systems and altered estimation of uncertainty remains unclear. In the present study, we hypothesized that altered uncertainty estimation, and the resulting cognitive impairment in developmental disorders, can be understood as an imbalance between the stochastic and deterministic representations of neural systems. This hypothesis was tested using a predictive-coding-inspired variational recurrent neural network (PV-RNN) in which the balance between stochastic and deterministic representations in the learning process was controlled by a weighting factor (WF) for the complexity term in the free-energy minimization. We also tested the impact of the length of the learning period (LP). The task for the PV-RNN was to learn and reproduce two-dimensional temporal sequences imitating reaching movements with temporal and spatial variations. The PV-RNN was trained while varying the values of the WF and the LP. Training results were evaluated by the reproduction of task sequences, adaptation to novel sequences, and uncertainty estimation. As a result of training, low-WF networks successfully reproduced the training sequences with a certain level of uncertainty, though with some prediction errors, indicating that the variations of the sequences were reproduced using stochastic dynamics. On the other hand, when high-WF networks regenerated the sequences, the prediction errors and uncertainty estimates were extremely small, suggesting that a deterministic representation was used. Low-WF networks showed better adaptability to novel sequences than high-WF networks, possibly because the latter relied on deterministic dynamics that hindered the correction of their predictions based on prediction errors.
Similar results were obtained with respect to the LP: long-LP networks showed deterministic dynamics and short-LP networks stochastic dynamics. These results demonstrate that altered uncertainty estimation and reduced adaptability to novel situations are closely related to the imbalance between stochastic and deterministic representations, suggesting that the proposed idea may contribute to understanding cognitive impairment in neurodevelopmental disorders.
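The weighting factor enters the objective as a multiplier on the complexity (KL) term of the variational free energy. Schematically, in our reconstruction of the standard form (with w denoting the WF; this is not necessarily the study's exact equation):

```latex
\mathcal{F} \;=\; \underbrace{\mathbb{E}_{q(z_{1:T})}\!\left[\sum_{t=1}^{T} -\log p\!\left(x_t \mid z_{1:t}\right)\right]}_{\text{accuracy (prediction error)}}
\;+\; w \sum_{t=1}^{T} \underbrace{D_{\mathrm{KL}}\!\left[q\!\left(z_t\right) \,\middle\|\, p\!\left(z_t \mid z_{1:t-1}\right)\right]}_{\text{complexity}}
```

A small w tolerates divergence of the posterior from the prior, favoring stochastic latent representations; a large w forces the posterior toward the prior, pushing the network to explain variation deterministically, which matches the low-WF and high-WF behaviors reported above.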


Functional magnetic resonance imaging analysis using deep neural networks to elucidate impairments of information processing in psychiatric disorders

Akio Murakami (Medical Institute of Developmental Disabilities Research, Showa University), Kei Majima(Graduate School of Informatics, Kyoto University), Hidehiko Takahashi (Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University)


In this poster presentation, we present two recent attempts to elucidate impairments of information processing in psychiatric disorders through functional magnetic resonance imaging (fMRI) analysis with deep neural networks (DNNs). In the first study, we measured fMRI brain activity from schizophrenia patients and controls while they judged the categories of objects in incomplete visual stimuli (a perceptual decision-making task). To detect fMRI signals related to visual completion (i.e., top-down modulation), we used a convolutional neural network (CNN) pre-trained for visual object recognition. The fMRI responses to incomplete images were translated into unit activations (i.e., visual features) in the CNN by multivariate decoding analysis. We examined how much, and at which level, visual features were modulated during visual completion, and compared this between the two groups. In the second study, we used a recurrent neural network (RNN) model to characterize the difference in brain activity between pathological gambling patients and controls. In the fMRI experiment for this study, subjects performed a gambling task in which they were required to adaptively change their strategy to obtain larger rewards (a reversal-learning task). An RNN model was also trained to perform the same gambling task optimally. By quantifying the similarity between the unit activations of the RNN and the fMRI brain activity of each subject, we identified, in a data-driven manner, which task-related parameters are encoded in the brain activity. In both studies, our neural encoding and decoding analyses using DNNs successfully captured differences in fMRI brain activity between patients and controls that could not be detected by conventional univariate fMRI analysis. These results suggest that DNN-based fMRI analysis can be a powerful approach for revealing the neural mechanisms of psychiatric disorders.


Endomicroscopic calcium imaging from the macaque primary visual cortex

Mineki Oguchi, Jiang Jiasen, Toshihide W Yoshioka, Yasuhiro Tanaka, Kenichi Inoue, Masahiko Takada, Takefumi Kikusui, Kensaku Nomoto, Masamichi Sakagami


Macaque monkeys have long been used as a neuroscientific model for understanding the neural circuits underlying sensory, motor, and cognitive processes that are highly relevant to humans. In vivo calcium imaging with genetically encoded indicators has recently begun to be applied to the macaque brain to monitor the activity of large populations of cells simultaneously. These studies used two-photon microscopy to observe neural activity from the cortical surface. An alternative is microendoscopic calcium imaging combined with implantable gradient index (GRIN) lenses, which enables imaging of neural activity in deeper layers with a compact and handy setup. The application of endomicroscopic calcium imaging, however, has so far been mainly restricted to rodents and marmosets. Here we developed techniques to image neural activity from the primary visual cortex (V1) of the macaque brain using a miniature fluorescence microscope with prism lenses. We injected AAV2.1-GCaMP6s and implanted prism lenses into the bilateral V1 of three macaques. During imaging, the monkeys performed a fixation task with a brief peripheral presentation of a Gabor patch at six different orientations. We succeeded in obtaining clear fluorescent signals responding to visual stimuli in four out of six hemispheres. Although some of these fluorescent signals came from crowds of neurons out of focus, we found tens of clear neuronal activities at cellular-level resolution. A subset of these neurons (and crowds of neurons) with fluorescent signal changes had their own receptive fields and various orientation selectivities, consistent with the well-known retinotopic map and orientation tuning of macaque V1. These results demonstrate that microendoscopic calcium imaging is a feasible and reasonable tool for investigating neural circuits in the macaque brain, enabling us to monitor fluorescent signals from a large number of neurons.


Foveal and peripheral visual fields offer distinct representations and generalization in motor learning for reaching movements

Naotoshi Abekawa, Hiroaki Gomi


We can learn multiple motor skills that are engaged in different behavioral contexts. To facilitate learning, each context may be linked with different internal states that are inherently represented in sensorimotor processing. Meanwhile, the brain computes the spatial coordination between gaze and a reach target during motor planning. Interestingly, an fMRI study (Prado et al., 2005) showed neural activity in different cortical areas for hand-reaching to foveal versus peripheral targets. These reports led us to propose a further hypothesis: learning of multiple motor skills is facilitated when each motor memory is linked with a different eye-hand coordination. To test this hypothesis, we conducted learning experiments with conflicting visuomotor rotations and with conflicting force fields. In both experiments, participants reached to a target while looking at the target (foveal reach: FOV) or elsewhere (peripheral reach: PER). The direction of the perturbation, a visuomotor rotation or a velocity-dependent force field (CW and CCW), varied randomly across trials but was coupled with FOV and PER trials. As a result of learning, reaching errors decreased, and aftereffects were then clearly observed, suggesting that the brain can form and retrieve distinct internal models of kinematic and dynamic control linked to the different gaze-reach coordinations. We further examined generalization performance using visuomotor rotations. For a foveal-reach adaptation condition, we found that adaptation performance dropped steeply for neighboring untrained fixation points, suggesting a small generalization. Interestingly, for a peripheral-reach adaptation condition, adaptation performance was maintained within the same hemifield but dropped in the foveal region and the opposite hemifield. Our results suggest that foveal and peripheral reaches are separately represented, and that this separation can form distinct motor memories.
The present study opens up a new avenue for understanding how the brain represents neural states for eye-hand coordination and how it utilizes such representations for motor learning.


Brain responses to tactile oddballs are modulated by illusory pulling direction

Jack De Havas, Sho Ito, Hiroaki Gomi


Asymmetric vibration at the fingertips causes an illusory sensation that the arm is being pulled. The neural basis of this pulling effect is not known. Oddball tasks are widely used in neuroscience to determine when stimuli are categorized and to probe the neural effects of context. We therefore used a somatosensory oddball task and EEG to detect electrophysiological markers of the illusory pulling sensation. The task used 100 ms bursts of vibration separated by ~1000 ms. Participants (n = 15) experienced blocks of leftward-asymmetric (left pull), rightward-asymmetric (right pull), and symmetric (neutral pull) trials. In each block, 80% of the trials were in the common direction, while 20% were in the other two directions (oddball stimuli). We hypothesized that the pulling direction would act as a neural context: specifically, that oddball responses would be enhanced when uncommon stimuli evoked a pulling sensation opposite to that of the common stimuli. In support of this, we found that opposite-direction oddball stimuli evoked a larger central P3 response (260-460 ms) than the same oddballs in the context of neutral stimuli. Moreover, when the neutral stimuli instead acted as oddballs, they produced a similarly enhanced P3 response, indicating that the P3 differences were caused by the pulling-direction context, not by low-level stimulus differences. The results suggest that illusory pulling sensations arise and are classified within 260 ms of stimulus onset, and that the cortical regions that generate the pulling sensation are sensitive to the pulling-direction context.


Statistical relationship between visual motion and self-motion characterizes spatiotemporal frequency tuning of implicit responses of eye and hand induced by visual motion

Daiki Nakamura, Hiroaki Gomi (NTT Communication Science Laboratories)


Visual motion is one of the important cues for detecting unexpected body fluctuations. Many previous studies have shown that visual motion plays crucial roles in quickly adjusting posture, eyes, and limbs during dynamic interaction with the environment. In particular, ocular and manual following responses (OFR [Miles, Kawano 1986] and MFR [Saijo et al. 2005]) are induced most strongly by low-spatial and high-temporal frequency stimuli (under 0.1 cpd and over 10 Hz) [Gomi et al. 2006]. In this study, we systematically examined why the tunings of the OFR and MFR peak at low spatial and high temporal frequencies even though motion perception is most sensitive at higher spatial and lower temporal frequencies. To examine this question, we assumed that the OFR and MFR are compensatory reactions against self-motion, and developed a convolutional neural network (CNN) that estimates the translational and rotational velocities of self-motion from monocular image sequences recorded by a head-mounted camera during several kinds of natural human movements. The spatiotemporal frequency tuning of the trained CNN peaked at low spatial and high temporal frequencies, as observed in the tunings of the OFR and MFR. Interestingly, the tuning peak of the CNN was shifted by artificially modifying the training datasets. Specifically, the tuning peak shifted to lower temporal and higher spatial frequencies with a limited head-velocity dataset, and to higher temporal and lower spatial frequencies with a lower frame-sampling dataset. This result implies that the tunings of the OFR and MFR are acquired from the statistical relationship between visual images and self-motion in natural behaviors, rather than from the statistics of either modality alone. We also examined the ecological validity of the CNN's internal representation by comparing its kernel activation properties with the neural activation properties of the visual cortex.
Paralleling the finding that motion direction selectivity is stronger in MT neurons than in V1 neurons [Wang & Movshon, 2016], we found that the convolution kernels in deeper layers of the CNN had stronger direction selectivity. In addition, phase-selective kernels were mainly found in the first and second layers of the CNN, whereas kernels in deeper layers showed weakly phase-dependent responses. This difference parallels the electrophysiological finding that V1 simple cells are phase dependent whereas V1 complex cells show phase-invariant responses [Hubel & Wiesel, 1962].
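Spatiotemporal tuning of this kind is typically probed with drifting sinusoidal gratings of a given spatial frequency (cpd) and temporal frequency (Hz). A minimal sketch of such a probe stimulus, with grid and frame-rate parameters chosen here purely for illustration, is:

```python
import numpy as np

def drifting_grating(sf_cpd, tf_hz, size_deg=10.0, n_pix=64,
                     duration_s=1.0, fps=60):
    """Drifting sinusoidal grating movie (frames x height x width).
    sf_cpd: spatial frequency in cycles/deg; tf_hz: drift rate in Hz."""
    x = np.linspace(0.0, size_deg, n_pix, endpoint=False)   # deg
    t = np.arange(int(duration_s * fps)) / fps              # s
    # phase advances by tf_hz cycles per second along one axis
    phase = 2 * np.pi * (sf_cpd * x[None, None, :] - tf_hz * t[:, None, None])
    return np.sin(phase) * np.ones((1, n_pix, 1))           # tile rows
```

Sweeping `sf_cpd` and `tf_hz` over a grid and recording the response magnitude of a model (or of the OFR/MFR) for each movie yields a tuning map like the one reported for the trained CNN.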


Development of social behavior analysis for monkeys in the natural environment

Riza Rae Pineda, Takatomi Kubo, Satoshi Murashige, Masaki Shimada, Kazushi Ikeda


Behavioral studies of nonhuman primates involve the observation of interactions and relationships between organisms in the natural environment. This field contributes to the understanding of the evolutionary processes and biological origins of characteristics that promote socialization among animals, including humans. In observing nonhuman primates, scientists model social interactions between individuals. In 2014, Shimada and Sueur analyzed the role of social play for infant and juvenile chimpanzees in Tanzania by constructing partial play networks and partial association networks of the individuals in the sample. Biologists have primarily collected and analyzed their data manually to generate behavior models. Schofield et al. used cameras to record interactions between wild chimpanzees over 14 years, collecting a total of 50 hours of video clips. Manual tasks such as data preprocessing and denoising, tracking, behavior and species classification, and feature generation from image datasets have been eased by machine learning techniques. Schofield et al. employed CNNs to identify the faces of the monkey subjects and determine their sex. The emergence of and developments in adversarial networks have paved the way for improvements in image recovery and reconstruction. Despite these advances, common computer vision problems, such as occlusion handling and recovery, remain ongoing challenges that scientists and engineers attempt to address. With the overall goal of developing a robust and automated system for monkey social behavior analysis, this study focuses mainly on activity classification and the detection of turning points where social interactions become positive or favorable for the individuals involved. We will also explore solving related issues such as occlusion handling and motion prediction to improve our system.
We aim for our research to relieve scientists of manual and repetitive tasks and to aid them in the analysis of monkey behavior. We will present our implementation plan and preliminary results.


Real-time visual coding with neural spikes

Qi Xu, Jiangrong Shen, Zhaofei Yu, Jian K. Liu


Neural coding, including encoding and decoding, is one of the key problems in brain-machine interfaces for understanding how the brain uses neural signals to relate sensory perception and motor behavior to neural systems. It is also the cornerstone for building robust pattern reconstruction for controlling physical devices that interact with neural signals. However, most existing studies deal only with the analog signals of neural systems, neglecting a unique feature of biological neurons, the spike, which is the fundamental information unit for neural computation as well as a building block for neuromorphic computing. To address these limitations, we propose a robust pattern reconstruction model, named the deep spike-to-pattern decoder, to reconstruct multi-modal stimuli from the event-driven nature of spikes. Using about 5% of the information, represented in terms of spikes, the proposed model can not only feasibly and accurately reconstruct dynamic visual and auditory scenes, but also rebuild stimulus patterns from fMRI brain activity. We demonstrate that our method achieves state-of-the-art performance. Importantly, it is highly robust to various types of artificial noise and background signals. The proposed framework provides an efficient way to perform multimodal feature representation and reconstruction in a high-throughput fashion, with potential use for efficient neuromorphic computing in noisy environments.
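The abstract does not detail its spike encoder, but the event-driven idea can be illustrated with a simple delta-modulation scheme (a common choice in neuromorphic pipelines, used here purely as an illustrative assumption): a spike is emitted only when the signal moves by a fixed threshold, so a slowly varying stimulus is represented by a small fraction of the original samples:

```python
import numpy as np

def delta_encode(signal, threshold=0.05):
    """Emit an ON (+1) / OFF (-1) event whenever the signal moves
    more than `threshold` away from the last encoded level."""
    level = signal[0]
    events = []  # list of (time_index, polarity)
    for t, v in enumerate(signal):
        while v - level >= threshold:
            level += threshold
            events.append((t, +1))
        while level - v >= threshold:
            level -= threshold
            events.append((t, -1))
    return events

def delta_decode(events, n, v0, threshold=0.05):
    """Rebuild a piecewise-constant approximation from the events."""
    out = np.empty(n)
    level = v0
    i = 0
    for t in range(n):
        while i < len(events) and events[i][0] == t:
            level += events[i][1] * threshold
            i += 1
        out[t] = level
    return out
```

The reconstruction error is bounded by the threshold while the event count stays well below the sample count, which is the trade-off behind spike-based representations.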


Hierarchical structure of speech rhythm in autism spectrum disorder

Tatsuya Daikoku, Shinichiro Kumagaya, Satsuki Ayaya, Yukie Nagai


Speech rhythm comprises a nested hierarchical structure of phonological units, including prosody, syllables, and phonemes. This hierarchy can be identified from speech waveforms: the prosodic, syllabic, and phonetic rhythms are represented in the amplitude modulation (AM) hierarchy at the delta (1-4 Hz), theta (4-8 Hz), and beta/gamma (12-30 Hz) bands, respectively. The AM hierarchy of speech sound also has a relatively straightforward neurophysiological interpretation: delta, theta, and beta/gamma oscillators in the auditory cortex couple with prosodic, syllabic, and phonetic phases respectively, underpinning each level of phonological intelligibility. A body of studies has suggested that autistic individuals have specific characteristics of speech rhythm, particularly in the prosodic domain. However, it remains unclear how the hierarchical structure of speech rhythm is represented in autistic speech, and how it interacts with neural oscillatory processing in the autistic brain. As a first step, the present study examined the AM hierarchy of autistic speech rhythm using two complementary modeling approaches: a model of the cochlear filterbank in the brain, and probabilistic amplitude demodulation based on Bayesian inference with no adjustments for the brain. The filterbank approach simulates the frequency decomposition performed by the cochlea, but can introduce artificial modulations into the sound waveform because bandpass filters can introduce modulations near the center frequency of the filter through ringing. In contrast, probabilistic amplitude demodulation models the statistical structure of natural sound without filtering, by inferring the modulators and a carrier through Bayesian inference. The results suggested that the two types of models (filtering vs. probabilistic) yield a similar AM hierarchy of speech rhythm. Further, the prosodic rhythm interacts with the other domains, such as the syllabic rhythm.
The present findings indicate the importance of comprehensively understanding the hierarchical structure of speech rhythm and the mutual relationships between its levels as well as each phonological domain. Further study is, however, necessary to examine how this hierarchical structure of speech rhythm interacts with the nested hierarchy of neural oscillations in the autistic brain. This study will shed light on an interdisciplinary understanding of how autistic individuals perceive and produce speech.
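A minimal numpy sketch of the filterbank-style step, assuming an FFT brick-wall band-pass rather than a realistic cochlear filter, extracts the AM envelope in one band (e.g. delta 1-4 Hz, theta 4-8 Hz, beta/gamma 12-30 Hz) as the magnitude of the band-limited analytic signal:

```python
import numpy as np

def band_envelope(x, fs, lo, hi):
    """Amplitude envelope of x within [lo, hi] Hz: FFT band-pass,
    then the analytic-signal magnitude (Hilbert envelope)."""
    n = len(x)
    X = np.fft.fft(x)
    f = np.fft.fftfreq(n, 1.0 / fs)
    # keep only the positive-frequency band, doubled -> analytic signal
    H = np.zeros(n, dtype=complex)
    band = (f >= lo) & (f <= hi)
    H[band] = 2 * X[band]
    return np.abs(np.fft.ifft(H))
```

A brick-wall filter shares the ringing caveat the abstract raises for band-pass filterbanks; probabilistic amplitude demodulation avoids filtering altogether.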


Quantifying developmental differences in drawing ability using a convolutional neural network

Anja Philippsen, Sho Tsuji, Yukie Nagai


Children’s drawing ability develops gradually with increasing age, a process accompanied by immense improvements in cognitive ability. For decades, drawing studies have thus been used as a tool to investigate the cognition of young children. An important open question is how children integrate sensory perceptions with their own predictions. Adults are suggested to perform this integration in an optimal way [Friston & Kiebel 2009], but the development of this process is unknown to date. Insights into this issue may be provided by investigating children’s representational drawing ability [Saito et al. 2014, Ford & Rees 2008]. In particular, modifications of the presented stimuli (i.e., bottom-up modifications) in the drawing task can reveal how strongly children rely on their own knowledge and how much they are affected by the bottom-up signal. Here, we propose an extension of the spontaneous picture completion task of [Saito et al. 2014] to investigate how children adjust their drawings to bottom-up modifications of the stimuli. We combine a traditional analysis using an adult rating study with a quantitative analysis of the drawings using a pretrained deep convolutional neural network [Simonyan & Zisserman 2014], which can additionally analyze the hierarchical structure of the drawings. Previous studies applied neural networks to analyze pictures that children drew of specific objects [Long et al. 2018, 2019]. Here, we modify the stimuli in a bottom-up way by varying, for example, the category of the presented stimuli. In line with previous studies, we find that children’s representations of different objects become more similar to adults’ representations with increasing age. Furthermore, the results indicate that the different drawing styles children showed (e.g., scribbling, coloring, or meaningful completion) lead to distinct activations in the network, particularly in the higher, fully connected layers.
Based on these findings, we quantify developmental change and individual differences in the drawings. The results demonstrate that our methodology, which uses bottom-up modifications instead of top-down instructions, has the potential to quantify children’s drawings, which may be particularly useful when testing children with limited speech capabilities, such as very young children or children with developmental disorders.
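One simple way to compare network representations of drawings, assuming feature vectors have already been extracted from a pretrained network such as VGG (the extraction step itself is not shown here), is cosine similarity between activation vectors:

```python
import numpy as np

def similarity_matrix(feats):
    """Pairwise cosine similarity between feature vectors, e.g.
    fully-connected-layer activations for a set of drawings."""
    F = np.asarray(feats, dtype=float)
    F = F / np.linalg.norm(F, axis=1, keepdims=True)  # unit-normalize rows
    return F @ F.T
```

Averaging the similarity between children's and adults' feature vectors within each age group is one way to quantify a developmental trend of the kind reported above.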


Two-stage sleep in cephalopods

Aditi Pophale, Sam Reiter


Sleep has been observed in every animal in which it has been examined, from humans to jellyfish. However, only certain vertebrates (mammals, birds, reptiles, and fish) have been shown to possess two stages of sleep, namely slow-wave sleep (SWS) and rapid eye movement sleep (REMS). The biological function(s) of two-stage sleep remain largely unknown, with proposals including thermoregulation, neural maintenance, memory consolidation, and mental simulation. Coleoid cephalopods (octopus, cuttlefish, and squid) are a group of mollusks which, uniquely amongst invertebrates, evolved large brains and complex behaviors. Here, we demonstrate that cephalopod sleep is characterized by two stages exhibiting many similarities to SWS and REMS in vertebrates. We recorded octopus (Octopus laqueus) behavior in laboratory settings continuously over days. To study their behavior quantitatively, we trained deep neural networks (Mask R-CNN) to segment the animal and to distinguish particular body parts of interest (e.g. the eye). These animals spend most of the daylight hours in shelters that they actively fortify with rocks. Within their shelter they spend many hours immobile, eyes partially closed, breathing slowly, and adopting a pale white color. This state is accompanied by an increase in arousal threshold and is rapidly reversible when the animal is sufficiently disturbed, suggesting that it constitutes octopus sleep. We are currently testing whether this state is under homeostatic regulation. Approximately every hour while asleep, the octopuses undergo a dramatic change in behavior. Their skin rapidly cycles through a range of colors and textures, their arms move erratically, their breathing quickens, and their eyes move, open, and close. These episodes last for minutes and differ in their fine structure. After each episode, the octopus returns to normal sleep behavior.
We observed a similarly regular cycle between quiescence and rapid body, eye, and skin-pattern movements in a cuttlefish (Sepia pharaonis). This suggests that two-stage sleep may be widespread amongst cephalopods, and that it must have appeared before the evolutionary divergence of cuttlefish and octopus, ~276 million years ago. The behavioral similarities between two-stage sleep in cephalopods and vertebrates, despite these groups being separated by ~600 million years of evolution, suggest a fundamental relationship between two-stage sleep and neural complexity.


Feature extraction for schizophrenia brain image using convolutional neural network

Hiroyuki Yamaguchi, Yuki Hashimoto, Genichi Sugihara, Jun Miyata, Toshiya Murai, Hidehiko Takahashi, Manabu Honda, Yuichi Yamashita


【Introduction】 The advancement of brain imaging techniques such as magnetic resonance imaging (MRI) has contributed to the detection of structural and functional abnormalities in the brains of patients with schizophrenia. In recent psychiatric neuroimaging research, feature extraction plays an important role. Measurement based on regions of interest (ROIs) has been widely used as a feature extraction method. However, since ROIs are arbitrarily defined, information originally included in voxel-level images may be overlooked. As a novel alternative, deep neural networks, which can extract features in a self-organized manner, have attracted attention. In this study, we investigated a feature extraction method for schizophrenia brain imaging using a deep convolutional neural network. 【Methods】 Voxel-level MRI images were used after preprocessing in which the gray matter was segmented from each image, spatially standardized, and smoothed. A three-dimensional convolutional autoencoder (3D-CAE) was trained to reproduce the input MRI images using a Kyoto University dataset consisting of 82 schizophrenia patients (SZ) and 90 healthy subjects (HS). Using the trained 3D-CAE, features for evaluation were extracted from the Center for Biomedical Research Excellence (COBRE: 71 SZ and 71 HS) dataset. ROI-based features were computed as the average intensity values within each of the conventional 116 ROIs. The efficacy of the proposed method was evaluated against the ROI-based method using linear regressions for the prediction of demographic and clinical information, including age, severity of symptoms, and dose of medication. 【Results】 In the prediction of age, regression with 3D-CAE features performed worse than with ROI-based features (p<0.001). On the other hand, in the prediction of dose of medication, the 3D-CAE was superior to the ROI method (p<0.001).
In the prediction of the severity of positive symptoms, the 3D-CAE was slightly better than the ROI method, although the difference was not significant (p=0.09). For the severity of negative symptoms, there were no significant differences (p=0.9). 【Conclusions】 The proposed feature extraction method using a 3D-CAE outperformed the conventional ROI-based method for predicting schizophrenia-related clinical information. This advantage may result from the ability of the 3D-CAE to recognize patterns of local morphological change, and their nonlinear combinations, in voxel-level MRI, consistent with observations that several brain regions change in parallel in schizophrenia.
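The ROI-based baseline is straightforward to sketch: average the voxel intensities within each atlas region (the abstract uses 116 conventional ROIs; the toy label map in the test below is purely illustrative):

```python
import numpy as np

def roi_features(volume, atlas, n_roi):
    """Baseline features: mean intensity of `volume` within each
    atlas region. `atlas` holds integer labels 1..n_roi (0 = background)."""
    return np.array([volume[atlas == r].mean() for r in range(1, n_roi + 1)])
```

The 3D-CAE replaces this fixed, hand-defined averaging with features learned directly from the voxel-level images.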


Low-dimensional representations of perceptual decision making across the mouse dorso-parietal cortex

Javier G. Orlandi, Mohammad Abdolrahmani, Ryo Aoki, Dmitry R. Lyamzin, Andrea Benucci


Perceptual decision making involves processing sensory information through feedforward and recurrent networks in cortical and subcortical brain regions. These networks are concurrently driven by a multiplicity of other cognitive signals related to interactions with the environment, beliefs, and past experiences. Hence, identifying how decision-related information propagates and interacts throughout cortical circuits in the presence of these other cognitive variables has proved challenging. We focused on perceptual decisions in mice trained in a two-alternative forced-choice visual orientation discrimination task. We analyzed widefield fluorescence signals recorded during behavior from mice (n=8, Thy1-GCaMP6f) implanted with a cranial window over occipital-parietal areas, providing simultaneous access to 10-12 cortical areas. We used LocaNMF, a new method for decomposing spatially extended signals, to identify neural signatures of several behavioral states: attention, history-related components, body movements, and ongoing decisions. These states defined a low-dimensional space in which we were able to analyze their interactions. A robust decision axis could be defined in the absence of motor-related signals, and a movement-related axis appeared before any motor activity, denoting preparatory or choice-related information. The decision axis was stable across different attentional levels and previous outcomes, but the magnitude by which left-right choice trajectories separated across a trial depended on both attention and history. This separation also correlated with task performance, altogether suggesting an interaction between attentional and decision signals in a shared subspace. Since our choice of activity decomposition produces signals with spatially connected profiles, we were able to show that most behavioral states are better identified with distributed components spanning several cortical areas.
Some states, however, were more localized, such as those related to motor signals. Our observations could be mapped onto and explained by a recurrent neural network (RNN) model that simultaneously receives sensory and top-down signals (for attention and past outcomes). Interestingly, training the network on the animals' choices (rather than on the optimal solutions) automatically yields the characteristic psychometric curves. Low-dimensional trajectories identified in an unsupervised manner from the RNN activations correlated with the experimentally observed state trajectories, denoting a possible link between these brain and artificial representations. In conclusion, broad networks of dorso-parietal areas integrate sensory and choice variables into representations that are modulated by attentional and history variables while retaining low dimensionality. This result might reflect a fundamental principle in computational systems that combine top-down and bottom-up signals within functional architectures with dense recurrent connectivity.
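LocaNMF decomposes the widefield movie into spatial components and their time courses. A sketch of the underlying factorization, assuming plain non-negative matrix factorization with multiplicative updates and omitting LocaNMF's locality constraints on the spatial components, is:

```python
import numpy as np

def nmf(V, k, n_iter=300, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates (Lee & Seung):
    V (pixels x time) ~ W @ H, with W = spatial components and
    H = temporal components, both non-negative."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update time courses
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update spatial maps
    return W, H
```

The locality constraint in LocaNMF additionally anchors each column of W to an atlas region, which is what makes the recovered components spatially connected and interpretable.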


Knowledge representation for neural circuits subserving saccadic eye movement based on a Brain Information Flow description

Yoshimasa Tawatsuji, Naoya Arakawa, Hiroshi Yamakawa


In the field of cognitive neuroscience, brain functional imaging methods such as fMRI (functional magnetic resonance imaging) and PET (positron emission tomography) have been adopted to ground cognitive functions in the structures of the brain. However, cognitive tasks are achieved not by a few brain areas alone (those detected in activation studies) but through the cooperation of various areas. Kitamura (2002) proposed an extended device ontology, a framework relating the structure of an artifact to its functions. This framework provides the perspective of the purpose for which an artifact system is designed, and the role by which each part (structure) of the system achieves its subfunction with respect to that purpose. Similarly, it is necessary (1) to decompose the function of a neural circuit (i.e., the achievement of the assumed target task) into subfunctions, and (2) to assign them to neural structures in a manner consistent with neuroscientific facts. In particular, it is important to perform a functional decomposition consistent with the structure of the brain, because there is generally more than one possible functional decomposition. We have proposed a knowledge base (the whole-brain reference architecture) for describing functional hypotheses over the whole brain as a standard, together with knowledge description based on the brain information flow (BIF) format, in which an information-flow format is adopted to describe functional hypotheses of biological and artificial neural networks. In the BIF format, structure is described in terms of "Circuits" and "Connections" representing the connection relationships between Circuits, where a Connection corresponds to the set of axons projected from neurons of the same subtype. In this poster, we report a BIF-based knowledge description of the neural circuits subserving saccadic eye movement (SEM).
Previous studies have revealed that SEM is realized by different neural circuits for the horizontal and vertical directions. However, there are few systematic descriptions of the functions of the subcircuits constituting the SEM circuit. Through a BIF-based knowledge description of the neural circuit realizing SEM, we systematically organize this structural and functional knowledge. At present, the knowledge description covers the superior colliculus and the brainstem. In the future, it will be necessary to describe the relationships with the frontal eye field, basal ganglia, and hippocampus.
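The Circuit/Connection vocabulary lends itself to a simple graph encoding. The sketch below is a hypothetical illustration; the names and function labels are placeholders, not entries from the actual knowledge base:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Circuit:
    """A BIF Circuit: a brain structure with a hypothesized (sub)function."""
    name: str
    function: str = ""

@dataclass(frozen=True)
class Connection:
    """A BIF Connection: an axonal projection between Circuits,
    corresponding to the axons of neurons of the same subtype."""
    src: Circuit
    dst: Circuit
    signal: str = ""

# Illustrative fragment of a saccade (SEM) description
sc = Circuit("superior colliculus", "encode saccade target")
bs = Circuit("brainstem saccade generator", "generate burst command")
sem_flow = [Connection(sc, bs, "desired displacement")]
```

Functional decomposition then amounts to assigning a subfunction string to each Circuit such that the Connection graph is consistent with known anatomy.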


Coherent choice-direction representations between task epochs are supported by dynamic coordination of the perirhinal cortical neurons

Tomoya Ohnuki, Yuma Osako, Yoshio Sakurai, Junya Hirokawa


Cortical neurons show distinct firing patterns across multiple epochs of a given task, such as cue, action, and reward, which are characterized by different computational demands. Recent studies suggest that such distinct response patterns underlie dynamic population structures supporting computational flexibility. On the other hand, individual cortical neurons often show coherent response patterns across different epochs, suggesting an ability to form higher-order representations by integrating relevant information. Because these studies offer explanations at different levels (population versus single neuron), it remains elusive how such coherent single-neuron representations are reconciled with the dynamic population structure. To synthesize our understanding of these hypotheses, we analyzed neural responses in the perirhinal cortex (PRC). Rats (n = 7) were trained in a two-alternative forced-choice task in which they chose a port (left/right) associated with a presented cue (visual/olfactory) to obtain a water reward. Spiking responses were recorded from the left PRC (n = 302 neurons). We found that individual neurons encoded the choice directions, regardless of the cue modality, at various time points across the trial. The choice-direction encoding peaked around two epochs: the cue epoch (−400 to 0 ms before the decision) and the reward epoch (200 to 600 ms after the choice). When the rats performed the task erroneously, the choice-direction encoding decreased in the reward epoch but not in the cue epoch, indicating different neural computations in these epochs. We found that many neurons encoded opposite choice directions between these epochs, suggesting a form of coherent single-neuron representation. Using principal component analysis, we identified neural subspaces associated with each epoch, reflecting coordinated patterns across the PRC neurons.
The cue and reward epochs shared the primary neural dimensions in which the choice directions were consistently discriminated. Interestingly, those dimensions were supported by only moderately correlated weighting patterns across neurons, and individual neurons dynamically changed their contributions to the population between the epochs. These results suggest that individual neurons have unique computational roles and flexibly coordinate to support computations associated with different epochs, even while holding temporally coherent representations.
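The subspace step can be sketched with plain PCA on trial-by-neuron matrices of epoch-averaged firing rates (the synthetic data and dimensions in the test below are illustrative, not the recorded dataset):

```python
import numpy as np

def epoch_subspace(X, n_dims=2):
    """Principal axes (neurons x n_dims) of population activity
    X (trials x neurons), via SVD of the mean-centered matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_dims].T

def project(X, axes):
    """Project trials into the subspace spanned by `axes`."""
    return (X - X.mean(axis=0)) @ axes
```

Projecting cue-epoch trials onto the reward-epoch axes (and vice versa), then asking whether left and right choices still separate, tests for shared choice-discriminating dimensions of the kind reported above.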

Social Impact and Neuro-AI Ethics

Passive BCI for dementia onset detection and cognitive intervention monitoring

Tomasz M. Rutkowski, Masato S. Abe, and Mihoko Otake-Matsuura


The poster presents a practical application of machine learning (ML) in the 'AI for social good' domain, in particular concerning the problem of dementia onset prediction. The increase in dementia cases is imposing a significant medical and economic burden in many countries. Approximately 47 million older adults live with a dementia spectrum of neurocognitive disorders, according to a recent report of the World Health Organization (WHO), and this number is expected to triple within thirty years. This growing problem calls for the application of AI-based technologies to support early diagnostics for cognitive interventions and subsequent mental wellbeing monitoring and maintenance with so-called 'digital-pharma' or 'beyond a pill' therapeutic strategies. The poster presents results of our analysis of EEG brainwave responses in a BCI-based emotional-stimulus and short-term implicit memory learning task. We focus on the development of digital biomarkers for detecting and monitoring the progress of dementia. We discuss a range of machine-learning accuracies, with encouraging results from classical shallow and deep learning approaches for automatic discrimination of normal cognition versus mild cognitive impairment (MCI). The classifier input features consist of older adults' emotional valence and arousal EEG responses, obtained from a group of 35 older adults participating voluntarily in the reported dementia biomarker development project. The presented results showcase the social benefits of utilizing artificial intelligence (AI) for the elderly and constitute a step toward advancing ML approaches for the subsequent use of a simple BCI-based EEG examination in MCI and dementia onset diagnostics.


Cognitive agent: Digital technologies for neuromarketing and fighting fake news

Yuriy Dyachenko, Oleksandra Humenna, Oleg Soloviov, Inna Skarga-Bandurova


Nowadays perspective of the creation of autonomous cognitive agents opens the way to the enriching of human-computer interaction by means of the building of intelligent assistants and automation of cognitive tasks. We suppose that the autonomous behavior of cognitive agents may be the consequence of a causal gap between physical processes and self-referential meaningful processing of information, which is related but not determined by physical processes. Cognitive representations as the agent’s subjective estimations are a base for indeterminism and unpredictability of an agent’s behavior. We propose to consider conceptual spaces as these cognitive representations. Conceptual spaces enable qualitative modeling of the quantitative subjective estimation. We suppose that the agent’s behavior is based on an agent’s subjective estimations in the conceptual space that are changed by means of perceptions and reasoning. Human activities affect the social (external) conceptual space. The latter can be influential on the agent’s (internal) conceptual space. These cognitive representations must be constantly reevaluated on the basis of the new data to defend against deception. This is the way to the understanding of human’s ability to make broad judgments and the possibility to understand the essence of things. To predict and evaluate the influences on cognitive spaces, we propose cognitive capital as an assessment of factors that can influence agent’s (internal) and social (external) senses, meanings, values, preferences, and as a result, utilities. The bidirectional transformation of these conceptual spaces could be a quantitative indicator of cognitive capital variations. This influence on the conceptual spaces possible due to digital, cognitive, and neuromarketing technologies as instruments to create, invest in, and manage cognitive capital assets. 
Considering that market interactions are based on utilities, cognitive capital could be a measure for evaluating and developing a market structure. This relates to fighting fake news as a component of the "marketing of statehood". The ratio of the cost of creating and delivering an influence (information and media) to its outputs (benefits) can be considered an indicator of the effectiveness of influencing the conceptual spaces. It is not ideologies or positions that are fighting now, but virtual images that often drive users into information traps. For example, the wide spread of fake news in mass media (especially social networks) and political campaigns can be attributed to its high, previously predicted, effectiveness. These effectiveness calculations should inform decisions about investment in physical, human, social, or cognitive capital. Such decisions can be made on the basis of marginal productivity theory, in which the most effective investment strategies are based on the marginal product of each type of capital.


Use of machine-learning agent in social neuroscience: a case of functional magnetic resonance imaging experiment

Akitoshi Ogawa, Tatsuya Kameda


Using machine-learning agents as interaction partners in neuroscience research on social interaction remains an ambitious goal. Here, we introduce our case of using a machine-learning agent in a functional magnetic resonance imaging (fMRI) experiment to investigate how brain regions associated with Theory of Mind are involved in second-order inferences during competitive decision making. We scanned 30 right-handed participants in a 3T MRI scanner while they played a two-player competitive economic game against three types of opponents. In the HUM condition, a human opponent outside the scanner played the game with the scanned participant. The other opponents were computer programs (FIX and LRN conditions). The choice probability of FIX was fixed at the equilibrium rate regardless of the participant's choices. LRN learned and predicted the participant's choices using a machine-learning algorithm: a perceptron with a sigmoid output function. LRN considered the participant's choices, its own choices, and the results over the most recent six trials, plus the participant's average bias; the perceptron thus received 19 inputs (= 3×6+1) on each trial. We used gradient descent on a mean squared error to update the weights trial by trial. The learning rate was set to 0.25 throughout the experiment, a value determined by pilot tests. Choice data from the practice session were used to determine the initial weights. Behaviorally, the high choice rate in the HUM condition was significantly correlated with that in the LRN condition but not with that in the FIX condition. The right temporoparietal junction (RTPJ) showed significantly higher activation in the HUM condition than in the computer conditions. Activation of the left temporoparietal junction (LTPJ) was significantly correlated with choices in the HUM and LRN conditions, but this relationship was weak in the FIX condition.
The spatial pattern of LTPJ activity was more similar between the HUM and LRN conditions than between the HUM and FIX conditions. These results indicate that the use of a machine-learning agent enabled us to reveal that the RTPJ is mainly associated with the perception of human agency, whereas the LTPJ is involved in second-order inferences (inferences about others' inferences about one's own beliefs) in a competitive game.
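The LRN opponent can be sketched as a sigmoid perceptron updated trial by trial by gradient descent on a squared error, with the input size (19) and learning rate (0.25) reported in the abstract. The initialization and feature encoding below are placeholders, not the authors' actual implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OnlinePerceptron:
    """Sigmoid perceptron updated trial-by-trial by gradient descent
    on a mean squared error, as described for the LRN opponent."""

    def __init__(self, n_inputs=19, lr=0.25, seed=0):
        rng = np.random.default_rng(seed)
        # In the experiment, initial weights came from a practice
        # session; small random values stand in here.
        self.w = rng.normal(0.0, 0.1, n_inputs)
        self.lr = lr

    def predict(self, x):
        """Probability that the participant picks option 1 next trial."""
        return sigmoid(self.w @ x)

    def update(self, x, target):
        """One gradient step on 0.5 * (y - target)**2."""
        y = self.predict(x)
        grad = (y - target) * y * (1.0 - y) * x   # chain rule through sigmoid
        self.w -= self.lr * grad
        return y
```

In use, `x` would encode the participant's choices, the agent's own choices, and the outcomes of the six most recent trials (3×6 features) plus an average-bias input, and `update` would be called once per trial with the participant's observed choice as the target.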


Rethought of ethical concerns on the identity as the human

Kazuhiko Shibuya


Our digitalized society raises issues of the growing ambiguity between AI and humankind. This ambiguity appears in situations such as losing the boundary between self and others, the sense of depersonalization induced by VR, and destabilized relationships between human beings and AI. Is it possible to dissolve the physical border between humans and AI while keeping AI a harmonious partner of the human? Certainly, it is necessary to reconsider the ethical rules of conduct between them, and firm barriers should be prepared to draw a borderline against unforeseen risks. For example, Ray Kurzweil recently warned that "the distinction between AI and humanity is already blurring," and furthermore, "We're merging with these non-biological technologies. We're already on that path. I mean, this little mobile phone I'm carrying on my belt is not yet inside my physical body, but that's an arbitrary distinction. It is part of who I am—not necessarily the phone itself, but the connection to the cloud and all the resources I can access there." Hence, we should first engage with traditional topics from the philosophical realm, such as epistemology, ontology, and the ethics of the human. The next step leads to further contemplation of ethical conduct between humans and AI. AI studies are already interlinked with these themes, and the author intends to deepen these issues to carry the discussion further.


Novel MRI-based geometric models for the quantification and prediction of morphometric changes in mild cognitive impairment converters

Hanna Lu, Jing Li, Li Zhang, Sandra Sau Man Chan, Linda Chiu Wa Lam


Background: Longitudinal global brain atrophy has been well documented. However, the ageing effect on region-specific cortical features in the context of brain atrophy, and the extent to which such measures can discriminate individuals with different cognitive status, are less investigated. We sought to quantify the ageing effect on the morphometry of the left primary motor cortex (M1) and dorsolateral prefrontal cortex (DLPFC) in normal ageing adults and mild cognitive impairment (MCI) converters. Methods: Baseline, 1-year, and 3-year follow-up structural magnetic resonance imaging scans were obtained from normal ageing adults (n=32) and MCI converters (n=22) enrolled in the Open Access Series of Imaging Studies (OASIS). Longitudinal changes in cortical features, including cortical volume, thickness, folding, and scalp-to-cortex distance (SCD), were examined in both groups. Results: Among the nonlinear trajectories of region-specific morphometry, a pronounced ageing effect was found only on the SCD of the left DLPFC (t = -2.54, p = 0.02) in MCI converters. The change in cortical folding of the left M1 and the change in SCD of the left DLPFC from baseline to 3-year follow-up could discriminate MCI converters from normal ageing adults. SCD changes of the left DLPFC were significantly correlated with global cognitive decline. Conclusions: Ageing has a prominent but differential effect on the trajectory of region-specific cortical changes in MCI converters. Our findings suggest that surface-based geometric measures, cortical thickness, gyrification index, and SCD in particular, could serve as valuable imaging markers and surrogate outcomes in studies of neurodegenerative diseases.
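The abstract does not specify how the two change scores were used to discriminate the groups. As a purely hypothetical sketch, the 3-year change scores could feed a simple logistic-regression classifier; the data below are synthetic, with invented effect sizes, and only the group sizes (32 and 22) follow the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-year change scores per subject:
# [delta cortical folding (left M1), delta SCD (left DLPFC)].
# Effect sizes are invented for illustration only.
normal = rng.normal([0.0, 0.1], 0.3, size=(32, 2))   # normal ageing adults
mci    = rng.normal([0.6, 0.9], 0.3, size=(22, 2))   # MCI converters
X = np.vstack([normal, mci])
y = np.concatenate([np.zeros(32), np.ones(22)])      # 1 = MCI converter

# Logistic regression fitted by plain gradient descent.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])        # intercept column
w = np.zeros(3)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    w -= 0.1 * Xb.T @ (p - y) / len(y)

pred = 1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5
acc = np.mean(pred == y)                             # training accuracy
```

With well-separated synthetic groups the classifier discriminates the two labels from the change scores alone, which is the qualitative claim the abstract makes for these two features.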


Unusual paradoxical sensory reactivities induced by functional disconnection: An embodied predictive processing model of neurodevelopmental disorder

Hayato Idei, Shingo Murata, Yuichi Yamashita, Tetsuya Ogata


Neurodevelopmental disorders are characterized by the heterogeneous and non-specific nature of their clinical symptoms. Multiple behavioral outcomes can occur within one individual, and symptoms overlap among different disorders. In particular, hyper- and hypo-reactivity to sensory stimuli is included in the diagnostic criteria for autism spectrum disorder and is reported across many neurodevelopmental disorders. However, the computational mechanisms underlying the co-existence of these paradoxical behaviors remain unclear. Here, using a robot driven by a hierarchical recurrent neural network with predictive processing and a learning mechanism, we investigated the effects of functional disconnection on the learning process and on subsequent behavioral reactivity to environmental change. In the experiment, the robot first learned multiple visuomotor patterns and estimated the dynamic uncertainty (inverse precision) of sensory information. The trained robot was then required to react to environmental change by updating its higher-level prediction so as to minimize precision-weighted prediction error. The results show that, through the learning process, functional disconnection between distinct levels of the hierarchical network could simultaneously weaken the precision of sensory information and of the higher-level prediction. In that network condition, bottom-up sensory information was sometimes overly strong compared to top-down prediction, while at other times the opposite was true. This disrupted balance between sensory information and higher-level prediction could cause the robot to exhibit both sensory-dominated and sensory-ignoring behaviors, corresponding to sensory hyper- and hypo-reactivity. Our findings suggest that the co-existence of sensory hyper- and hypo-reactivity in neurodevelopmental disorders may be explained as a consequence of developmental alterations in the process of precision-weighted prediction-error minimization.
A neurorobotics framework may thus be a useful method for bridging different levels of understanding of neurodevelopmental disorders and for clarifying how complex behavioral symptoms emerge through dynamic brain-body-environment interactions in an uncertain world.
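The adaptation step, updating a higher-level prediction to minimize precision-weighted prediction error, can be illustrated with a one-dimensional toy sketch. The learning rate, step count, and precision values are invented, and a lowered precision stands in, very loosely, for the effect of weakened sensory precision after functional disconnection.

```python
import numpy as np

def adapt_prediction(obs, pred, precision, lr=0.1, steps=50):
    """Gradient descent on the precision-weighted squared prediction
    error 0.5 * precision * (obs - pred)**2, updating the higher-level
    prediction `pred` toward the observation `obs`."""
    for _ in range(steps):
        error = pred - obs
        pred -= lr * precision * error    # gradient step on the error
    return pred

obs = 1.0  # a changed environment the robot must react to

# High sensory precision: the prediction is pulled to the observation
# (sensory-dominated behavior).
strong = adapt_prediction(obs, pred=0.0, precision=1.0)

# Weakened sensory precision: the same number of update steps barely
# moves the prediction, so the change is effectively ignored
# (sensory-ignoring behavior).
weak = adapt_prediction(obs, pred=0.0, precision=0.05)
```

In the actual study both the sensory precision and the precision of the higher-level prediction were learned quantities, and their joint weakening produced both behavior modes in one network; this sketch only isolates the role of the precision weight in the update.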


Decoding the language of speech from neural activity of rats

Motoshige Sato, Nobuyoshi Matsumoto, Yuji Ikegaya


Brain decoding, reading out the external environment perceived by an individual from neural activity, makes it possible not only to support physical disabilities but also to expand the abilities of the able-bodied. Previous decoding methods have relied on human-crafted features such as oscillatory frequencies; however, these methods may have overlooked other important features. We hypothesized that a deep neural network (DNN), which extracts latent features from a large amount of raw data without human bias, would improve the performance of decoding complex perceptual information. In this study, we focused on decoding human speech from neural activity in the rat auditory cortex. First, we tested whether human speech is so complex that rats cannot behaviorally distinguish between languages. For training, a rat was allowed to explore an open field with two nose-poke holes while English or Spanish phrases were presented at random; each hole corresponded to one language. When a rat poked its nose into the correct hole, it was rewarded with water. Even after more than a week of training, the rats' correct rates did not exceed chance level. Next, to determine whether a DNN could correctly predict the language presented to the rats from neural activity, we acquired a training dataset of local field potentials (LFPs) in the auditory cortex during speech presentation. We designed a DNN model that takes LFPs and returns language labels, performing convolutional operations along the temporal and spatial axes. After training on an LFP dataset of over 5,000 trials of human speech presentation, the model correctly estimated which language was presented with more than 90% accuracy. We then analyzed the output pattern of the final layer of the trained model, visualizing the feature vectors with UMAP, a dimensionality reduction method.
We found that the datasets obtained from some rats formed distinct clusters for individual phrases (i.e., the numbers of clusters and phrases were almost identical), while the other datasets formed two large clusters corresponding to the two languages. These results suggest that even though rats cannot explicitly discriminate English from Spanish speech, distinct representations emerge within their auditory cortex, and that a DNN can extract these representation patterns.
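A minimal forward pass of the kind of network described, convolutions along the temporal and spatial (channel) axes of one LFP trial followed by a two-way language readout, might look like the numpy sketch below. The channel count, kernel sizes, and random weights are placeholders; the actual model was a trained deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_along(x, kernel, axis):
    """Valid 1-D convolution of every slice of x along the given axis."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, x)

def relu(x):
    return np.maximum(x, 0.0)

def forward(lfp, k_time, k_space, w_out):
    """LFP trial (channels x samples) -> logits for the two languages."""
    h = relu(conv1d_along(lfp, k_time, axis=1))   # temporal convolution
    h = relu(conv1d_along(h, k_space, axis=0))    # spatial convolution
    feat = h.mean(axis=1)                         # pool over time
    return w_out @ feat                           # linear two-way readout

n_channels, n_samples = 16, 256
lfp = rng.normal(size=(n_channels, n_samples))    # stand-in for one trial
k_time = rng.normal(size=9)                       # 9-sample temporal kernel
k_space = rng.normal(size=3)                      # 3-channel spatial kernel
w_out = rng.normal(size=(2, n_channels - 2))      # readout over channels
logits = forward(lfp, k_time, k_space, w_out)     # shape (2,): Eng vs Spa
```

Training such a model on >5,000 labeled trials would correspond to fitting the kernels and readout so that `argmax(logits)` matches the presented language.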


Investigating a data-driven deep-learning approach to simulate whole brain dynamics

Takuto Okuno, Alexander Woodward


To elucidate brain function and pursue pathological studies, in silico simulation offers a number of opportunities to study the properties of the brain non-invasively. Such methods may be preferable to, or may complement, invasive and non-invasive experimental studies. Notably, there is a long history of MRI (magnetic resonance imaging)-based structural and functional analyses of the whole brain, and realistic whole-brain simulation using such data remains a challenging topic in neuroscience. Many methods have been investigated to simulate the activity of the brain across different spatio-temporal scales, from single-neuron models that reproduce electrical signals to whole-brain dynamics of anatomically distinct regions. In this study, we use a data-driven method to analyze the causal relationship between non-linear input and output, rather than capturing brain-area activity with an explicit mathematical model. We train a data-driven model that reproduces brain-region activity as it is, and we evaluate its performance over a number of aspects. Our model comprises a fully connected network in which each node is a deep neural network. The weights within and between nodes can be trained on multi-modal data, such as fMRI BOLD or calcium imaging, and the model can easily be expanded to large-scale simulation. We show brain-simulation results for fMRI data from the common marmoset (Callithrix jacchus) acquired as part of Japan's Brain/MINDS project. Funding: This research was supported by the program for Brain Mapping by Integrated Neurotechnologies for Disease Studies (Brain/MINDS) from the Japan Agency for Medical Research and Development (AMED). Grant number: JP20dm0207001.
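The architecture, a fully connected network in which each node is itself a neural network, can be sketched as a toy forward simulator: each region's next activity is produced by a small per-region network reading the current activity of all regions. The region count, hidden size, and random weights below are stand-ins for parameters that would be fitted to recorded fMRI BOLD or calcium-imaging time series.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regions, hidden = 8, 16

# One small two-layer network per brain region. Its input spans ALL
# regions, so within- and between-node connectivity live in its weights.
nodes = [
    (rng.normal(0, 0.3, (hidden, n_regions)),   # input -> hidden
     rng.normal(0, 0.3, (1, hidden)))           # hidden -> output
    for _ in range(n_regions)
]

def step(state):
    """Map the whole-brain state at time t to the state at t+1."""
    nxt = np.empty(n_regions)
    for i, (w1, w2) in enumerate(nodes):
        h = np.tanh(w1 @ state)
        nxt[i] = np.tanh(w2 @ h)[0]
    return nxt

# Roll the network forward to produce a simulated BOLD-like time series.
state = rng.normal(size=n_regions)
traj = [state]
for _ in range(100):
    traj.append(step(traj[-1]))
traj = np.array(traj)            # shape (time steps + 1, n_regions)
```

Because each node is an independent module, adding regions or swapping in deeper per-node networks scales the simulator without changing the loop, which is the expandability the abstract points to.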


An intelligent robotic system in retail: Machine learning approach

Shubham Sonawani (Arizona State University), Kailas Maneparambil (Intel), Heni Ben Amor (Arizona State University)


In retail, product placement and inventory management are labor-intensive processes that present various physical challenges. A robotic system in the retail workplace is efficient because it can intelligently automate labor-intensive and repetitive work. However, robotic systems used in retail often rely on active sensors such as LiDAR or time-of-flight (ToF) cameras for obstacle avoidance and 3D map generation, which increases the overall manufacturing cost of the system. Considering this cost, we propose a learning-based approach for a custom robotic system that uses low-cost passive sensors, such as monocular vision, for high-level tasks such as 3D map and point-cloud generation. We designed a depth-prediction framework that generates dense depth maps from RGB images alone using state-of-the-art convolutional neural networks (CNNs). The network has an encoder-decoder architecture with skip connections between encoder and decoder layers. The obtained depth maps are used to generate 3D planograms and point clouds, and in real time for obstacle detection and avoidance. Furthermore, a custom robotic manipulator, designed in collaboration with a startup, is used for manipulation and grasping tasks. Information from the predicted depth is used to generate grasp-affordance maps via a neural network for product grasping and placement. The manipulator's motion planning uses dynamic motion primitives (DMPs) trained on data generated from a ROS-Gazebo simulation of the robotic system; specifically, the open-source KDL and MoveIt libraries are used to train the DMPs. Sim2real transfer learning is applied to obtain the best performance on manipulation and grasping tasks by training on real-world data in addition to simulated data.
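As a rough illustration of the encoder-decoder-with-skip-connections idea (not the authors' network), a one-level, single-channel forward pass might look like the following. It uses an additive skip for brevity, where real depth networks typically concatenate feature maps, and all filters are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def filter3x3(x, k):
    """'Same' 3x3 filtering of a single-channel image (zero padding)."""
    p = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def down(x):
    """2x2 average pooling: the encoder's downsampling step."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    """Nearest-neighbour upsampling: the decoder's upsampling step."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def depth_forward(img, k_enc, k_dec, k_fuse):
    """One-level encoder-decoder with a skip connection."""
    e = np.maximum(filter3x3(img, k_enc), 0)   # encoder features
    b = down(e)                                # bottleneck
    d = up(np.maximum(filter3x3(b, k_dec), 0)) # decoder path
    fused = d + e                              # skip connection (additive)
    return filter3x3(fused, k_fuse)            # dense depth prediction

img = rng.random((64, 64))                 # stand-in for a camera frame
kernels = [rng.normal(size=(3, 3)) for _ in range(3)]
depth = depth_forward(img, *kernels)       # dense map, same size as input
```

The skip connection is what lets the decoder recover fine spatial detail lost at the bottleneck, which is why dense per-pixel depth maps use this architecture.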
To perform indoor localization and navigation of the mobile base efficiently and cost-effectively, Bluetooth Low Energy (BLE) sensors are used. With four BLE sensors placed at the corners of a room, the mobile base of the robotic system is localized at 10 Hz by triangulation of the BLE signals. This information is also used to correct drift in the linear velocities obtained from odometry. In the long term, the designed robotic system should help tackle an important aspect of retail marketing: optimizing product placement and generating planograms for product-sale analytics.
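Localization from four corner-mounted beacons can be sketched as range-based trilateration solved by least squares. The room dimensions are hypothetical, and the conversion from BLE signal strength to range is assumed to have been done already.

```python
import numpy as np

# Beacon positions: four corners of a hypothetical 10 m x 8 m room.
beacons = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 8.0], [0.0, 8.0]])

def localize(distances):
    """Least-squares position from ranges to the four beacons.

    Subtracting the first range equation from the others turns
    ||p - b_i||^2 = d_i^2 into the linear system A p = c."""
    b0, d0 = beacons[0], distances[0]
    A = 2.0 * (beacons[1:] - b0)
    c = (d0**2 - distances[1:]**2
         + np.sum(beacons[1:]**2, axis=1) - np.sum(b0**2))
    pos, *_ = np.linalg.lstsq(A, c, rcond=None)
    return pos

true_pos = np.array([3.0, 5.0])
ranges = np.linalg.norm(beacons - true_pos, axis=1)  # exact ranges
est = localize(ranges)
```

With noisy real-world ranges the same least-squares solve gives the best linear estimate, and running it at 10 Hz provides the position fixes used to correct odometry drift.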


Learning to walk with prosthetics

Geoffrey Clark, Heni Ben Amor


Intuitive and safe mobility is a necessary part of everyday life and a critical challenge that prosthesis users face every day. Many control algorithms have been successfully developed to increase the quality of life of both transtibial and transfemoral amputees. However, state-of-the-art controllers largely neglect latent variables, such as internal joint forces, that influence key outcomes in patients, owing to the extreme complexity of computing and sensing internal human attributes. These challenges call for a transformative view of human-machine symbiosis in which assistive devices automatically adapt their actions and parameters in a subject-specific, continuous, and biomechanically safe fashion. We identify assisted mobility as a problem of shared control between human and robot and therefore cast it as a symbiotic interaction in which human and robot collaboratively generate healthy, physical, bi-directional interactions within specific task constraints. In turn, we propose a different viewpoint on controllers by (a) learning to predict human motion and sensor values, (b) learning to infer latent variables, and (c) generating optimal control that takes the predictions of (a) and (b) into account. Our method combines a Bayesian framework with optimal control; it is lightweight, requires few training samples, and runs in real time on low-compute hardware. The result is a novel human-robot collaboration algorithm that generates appropriate control values for a myriad of such symbiotic interaction tasks while also taking biomechanical considerations into account and adapting to changes in gait. We start with Interaction Primitives, a machine learning method known for creating robust and accurate probabilistic models of human interactions. This framework encodes the mutual dependencies between interaction partners and is used to infer human motion, sensor values, latent variables, and control signals.
An optimal control strategy is then formulated using model predictive control with the probabilistic model to optimize the control output with respect to latent-variable effects on the subject across a finite time horizon. Our control method targets strongly coupled systems with reciprocal dependencies in which only one system can be actively controlled, i.e., assistive devices. We show that model predictive control integrates easily into Interaction Primitives, which allows us to optimize control with respect to estimated biomechanical variables.
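The inference step of an Interaction Primitive, a Gaussian prior over basis-function weights learned from demonstrations, conditioned on early observations of the current interaction, can be sketched as follows. The synthetic gait-like curves, basis parameters, and noise levels are all illustrative, not the authors' data or settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_basis(t, n_basis=8, width=0.02):
    """Normalized radial basis functions over phase t in [0, 1]."""
    centers = np.linspace(0, 1, n_basis)
    phi = np.exp(-(t[:, None] - centers) ** 2 / (2 * width))
    return phi / phi.sum(axis=1, keepdims=True)

# Prior over basis weights from demonstrations (synthetic sine-like
# curves stand in for recorded gait trajectories).
t = np.linspace(0, 1, 100)
Phi = rbf_basis(t)
demos = np.array([np.sin(2 * np.pi * t) + rng.normal(0, 0.05, t.size)
                  for _ in range(20)])
W = np.array([np.linalg.lstsq(Phi, d, rcond=None)[0] for d in demos])
mu_w = W.mean(axis=0)
Sigma_w = np.cov(W.T) + 1e-6 * np.eye(W.shape[1])

# Condition on the first 20% of a new cycle, then predict the rest:
# a standard Gaussian posterior update over the weights.
obs_idx = slice(0, 20)
H, y = Phi[obs_idx], demos[0][obs_idx]
R = 0.05**2 * np.eye(y.size)                       # observation noise
K = Sigma_w @ H.T @ np.linalg.inv(H @ Sigma_w @ H.T + R)
mu_post = mu_w + K @ (y - H @ mu_w)
pred = Phi @ mu_post                               # inferred full trajectory
```

In the full method this same conditioning recovers not just future motion but sensor values, latent variables, and control signals, and the predicted trajectory distribution is what the model predictive controller optimizes over a finite horizon.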